On 6 May 2017, The Economist newsweekly published an article that inserted itself into the zeitgeist like nothing that venerable publication had ever run before. It was titled ‘The World’s Most Valuable Resource is No Longer Oil, But Data’, and dealt for the most part with the antitrust implications of Big Tech. So visceral was the analogy that the phrase “data is the new oil” became the defining metaphor of the digital age.
The Economist was trying to make a limited point—that all new industries grow rapidly when they start out, only to be eventually reined in by regulators once they get too big: “A new commodity spawns a lucrative, fast-growing industry, prompting antitrust regulators to step in to restrain those who control its flow. A century ago, the resource in question was oil. Now similar concerns are being raised by the giants that deal in data, the oil of the digital era.”
But the analogy they hit upon was too powerful to remain confined to that narrow context. Before long, other parallels were being drawn: between the way tech reduces the friction in our daily lives and the lubricating properties of oil, and between the way data fuels the modern economy and the way oil fuelled the Industrial Age.
Oil has been so deeply integrated into the functioning of the economy that its price has long been a key economic indicator. Oil companies were once some of the most valuable on the planet and, despite the dominance of tech giants, still rank high in terms of market capitalization today. But Big Oil has been dwarfed by Big Tech companies, some of which are among the wealthiest corporate entities ever to have existed. This is at the heart of the comparison between the oil industry and the data industry, but it is also where the similarity ends.
Although Big Oil and Big Tech are among the most highly valued industries on earth, at their core the two couldn’t be more different. Oil is a scarce natural resource extracted from deep inside the earth in a form that is largely useless until it has been thoroughly processed and refined. It is, by definition, finite, so much so that its market price fluctuates based on its availability. In its most common form, oil is a single-use product that literally has to be destroyed in order to release the energy that we use it for.
Data could not be more different. It is, for all practical purposes, infinite, limited only by our imagination in terms of what exactly we want to measure and in how much detail. Unless fettered by regulation, it is unconstrained by geography, a fact borne out by the way in which Big Tech companies collect data from all over the world without ever having to leave the shores of the countries in which they are based. Finally, not only is data not destroyed when it is consumed, but modern data technologies also excel at allowing the same item of data to be used by many different people, either simultaneously or again and again, without any degradation in quality. For all these reasons, the value of data bears no relationship to scarcity, the cost of extraction or the lottery of geography.
In summary, data could not be more unlike oil. This, unfortunately, is a nuance that seems to have escaped those tasked with its regulation. Around the world, more and more laws are seemingly being written on the presumption that data sets are scarce natural resources whose benefits need to accrue to those who have contributed their personal data to them and, by extension, to the countries in which they reside. This appears to be the line of thinking behind many recent judicial and regulatory developments, such as the second Schrems decision in the European Union and India’s inclusion of data localization provisions in its forthcoming privacy law. But this approach fails to engage with the essential attributes of data and, as a result, not only produces imperfect regulation but also misses the opportunity to properly capitalize on all that data has to offer.
As long as the internet exists, data can be accessed from anywhere. Regulations that require data to be physically stored within the geographical boundaries of a particular country don’t take this essential attribute into account. We’d be far better off focusing on improving our ability to access all the data that we need, rather than assuming that forcing data to be localized will give us access to all the data we want.
Finally, it is important to recognize that there is more to the value of a data set than the elements of data of which it is composed. To analyse what was collected, it is also essential to understand how the data was collected and arranged. Determining what to measure, how to collect it, and the fields of data with which it should be associated is a specialized skill, as is ensuring that a database, once created, remains usable and free of bias.
Unfortunately, our regulation of data is so focused on asserting control over the data sets accumulated by Big Tech companies that we have forgotten to build this muscle. If we want data to power decision-making, we need to ensure that instead of simply lifting and shifting data sets, we actually make the effort to learn what it takes to build data sets of our own—ones that are relevant to our own context. It is only once we make this shift in our thinking that we will become a data-first economy.
Rahul Matthan is a partner at Trilegal and also has a podcast by the name Ex Machina. His Twitter handle is @matthan