The Open Data


It's the open data economy, not the internet

I’ve watched the internet grow from squabbles about who built the first e-commerce site (hothothot.com, in Pasadena, CA) to a multi-trillion-dollar economy. I’ve been doing this entrepreneur thing for over 30 years, from building real-time trading systems for Wall Street when ‘digital’ was only a buzzword that nobody understood, to debating the ethics of using cookies to track website visits, to arguing over whether or not JavaScript was even a programming language. I survived the first internet bubble swimmingly, building a chat-based customer service technology that is now the default for online support. And now, the Boston Consulting Group is reporting that the internet economy contributes between 5% and 9% of GDP in established markets, and that in developing markets it is growing between 15% and 25% annually. I’ve literally watched the internet grow up from an idea to the Open Data Economy.


At the root of this incredible economic engine is Social Production, where users are given “free” access to technology and use that technology to create Open Data. Open Data is defined (by Wikipedia, itself an example of social production) as data that “is freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control”. Social Production is the organization of “micro-work”, which is (again from Wikipedia) “a series of small tasks which together comprise a large unified project, and are completed by many people over the Internet”. Micro-work can be something as simple as creating a link to this web page, posting an image on Flickr (1.8 billion images are posted every day on the internet), sending a text or writing a blog article like this one. Social Production also requires social capital, which represents the value of the participants’ social relationships. So, Micro-work + Social Production = Open Data, the building blocks of the economic engine that is the foundation of the Open Data Economy.

There are three ways companies create value in this Open Data Economy. The first is to aggregate and analyze the micro-work output and to sell the results of that analysis. For example, Google AdWords leverages freely available information (social production) by ‘crawling’ and ‘indexing’ the web’s publicly visible content (open data). The second is to derive another layer of value by mining the output of that micro-work analysis. These mining techniques were developed specifically for the collection and analysis of data sets too large for traditional data processing, otherwise referred to as “big data”. Facebook’s Social Graph, and the targeted advertising products built on it, are an example of the collection and analysis of the social production of its members. The third value creation strategy is used by companies such as Amazon, which add value to their products and services by aggregating micro-work in the form of individual product reviews. The value created in the Open Data Economy is the result of literally millions of volunteer micro-workers.


Let's look at some of the technology upon which this Open Data Economy is built, specifically HFS, or Hierarchical File System. The first step to understanding the role of HFS is to understand that the World Wide Web (or just web) is not the internet. The web is a network of documents stored on a network of computers called the internet (a global system of interconnected computer networks that use the standard Internet protocol suite (TCP/IP) to link several billion devices worldwide, according to Wikipedia). The web was originally built as an easy way to share documents between the computers connected by that network. The HTTP protocol (built on top of TCP/IP) made it possible to connect a network of documents stored on the Internet. The way this works is that web browsers access shared files by connecting to applications called web servers using the HTTP protocol. When I type a URL (Uniform Resource Locator) into the address field at the top of the browser window, I'm literally saying: 'go get me that file on that computer at that address'. Literally, go make a copy of that file, and display it in my browser.
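To make the 'go get me that file on that computer at that address' idea concrete, here is a minimal Python sketch that splits a URL into those pieces. The function name describe_request and the example.com address are made up for illustration; only urlsplit comes from Python's standard library.

```python
from urllib.parse import urlsplit


def describe_request(url: str) -> dict:
    """Split a URL into the parts a browser uses to ask for a file.

    describe_request is a hypothetical helper for this article,
    not part of any browser or library.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # how to talk to the other computer
        "computer": parts.hostname,  # which computer holds the file
        "file": parts.path or "/",   # which file to copy; "/" if none is named
    }


# Hypothetical address, used only as an example.
request = describe_request("http://www.example.com/articles/open-data.html")
print(request)
```

Nothing is fetched here. The sketch only shows that a URL already names a protocol, a computer and a file before any copy is made.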

Configuring a web server primarily involves identifying the files you want to share and setting their permissions correctly, so that the web server can share the files (make copies of them) and send them to the browsers requesting to view them.
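As a rough illustration of that configuration step, the Python sketch below grants read permission on a file to every user and then checks that it took effect. It assumes a POSIX-style system (Linux or Mac) and uses the simple read/write permission bits as a stand-in for whatever permission system a real server relies on. The helper names make_world_readable and is_world_readable are invented for this example.

```python
import os
import stat
import tempfile


def is_world_readable(path: str) -> bool:
    """Return True if the 'other users may read' bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def make_world_readable(path: str) -> None:
    """Add read permission for owner, group and others, keeping other bits."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current | stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)


# Assumption: a temporary file stands in for a page we want to share.
# mkstemp creates it readable and writable by its owner only.
fd, shared_path = tempfile.mkstemp(suffix=".html")
os.close(fd)

before = is_world_readable(shared_path)  # not yet shareable
make_world_readable(shared_path)
after = is_world_readable(shared_path)   # now any process may read it

os.remove(shared_path)
print(before, after)
```

A real web server usually runs as its own user, so in practice it is that user (or its group) that needs the read permission. Granting it to everyone is just the simplest case to show.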

All modern operating systems (Linux, Mac, Windows) use the same basic file system (or a file system with similar features), called HFS. At the heart of the hierarchical file system is what is known as the Access Control List, a list of permissions attached to a file that identifies which users (or computer processes) are granted access to it. So configuring a web server is about making sure the web server process has read access to the desired files. Once the file leaves the control of the computer (via the web server and the web browser), the original computer no longer has ANY control over the file permissions. In effect, the web server interprets the READ permission as 'go ahead and start making copies of the file'. And this is exactly what it does: it reads the file from the disk, writes the file onto the wire and sends it via the Internet to the requesting computer, where the file is displayed in your local browser. The browser now has control of the copy it has made, which is a semi-permanent copy in its cache. The file has been copied from the web server to your local disk. Since the copy is beyond the control of the original computer, it can be replicated and/or modified. This fundamental property of making copies is necessary; otherwise local applications cannot reliably access the file, for example in the event of a network outage or a host of other events. It is critical that the browser have local read access to the branch in a hierarchy or its parent. All these conditions or "features" are necessary to the proper operation of the entire system.
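The loss of control described above can be sketched in a few lines of Python. This is a simplified model, not how any real web server or browser is implemented: two temporary folders stand in for the server's disk and the browser's cache, and the hypothetical function serve_copy stands in for the whole read-from-disk, send-over-the-wire, store-in-cache trip.

```python
import tempfile
from pathlib import Path


def serve_copy(original: Path, cache_dir: Path) -> Path:
    """Read the original file and write an independent copy into the cache.

    serve_copy is an illustrative stand-in for server plus browser;
    read permission on the original is all it needs.
    """
    data = original.read_bytes()        # the server reads the file from disk
    cached = cache_dir / original.name  # the browser picks a spot in its cache
    cached.write_bytes(data)            # the copy now lives on the local disk
    return cached


with tempfile.TemporaryDirectory() as server_dir, \
        tempfile.TemporaryDirectory() as browser_cache:
    original = Path(server_dir) / "page.html"
    original.write_text("<p>original</p>")

    cached = serve_copy(original, Path(browser_cache))

    # The holder of the copy can change it; the original has no say.
    cached.write_text("<p>modified copy</p>")

    original_after = original.read_text()
    cached_after = cached.read_text()

print(original_after)
print(cached_after)
```

The original file is untouched, while the copy has been changed without the original computer's permissions ever being consulted.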

The web is a network of documents stored on a network of computing devices called the Internet. The browser cookie is the method used to collect the micro-work of social production. The Open Data Economy is therefore dependent on this voluntary surrender of control of the files. The ability to make copies is an inherent component of the architecture needed for the Web to operate, which is a reflection of the idiosyncrasies of the permissions system of HFS. HFS is therefore the foundation of the Open Data Economy.


We learned about what we have today, which is the Open Data Economy:

  • The Open Data Economy is the economics of open data and the value-creation strategies built around it.
  • Social Production is the organization of micro-work, a series of small tasks completed by a large group.
  • There are three ways to monetize social production: aggregating and analyzing it, mining the results of that analysis, and using it to add value to products and services.
  • The Hierarchical File System is the foundation of the Open Data Economy.

I’m not making a value judgement about whether or not Open Data is good or bad; however, there are some realities that we need to be aware of. The Open Data Economy and Social Production have created some incredible and truly useful things: this website, for example, Wikipedia, the search engine used to find this article, and Twitter, where this article was announced as being available. These are all incredibly valuable services that I use every day. But there are consequences to data, namely that it is Public-By-Default, and I’ll begin to explore these consequences in my next article.