Is treating data as property a good idea in the context of AI? Be careful what you wish for

Is property back in style? Are we now at the stage where, in the virtual world of data, property is also law? If this is the case, what does this mean for AI?

Author: Argyri Panezi, Assistant Professor at IE Law School and Professor at the Master in Legaltech; Research Fellow at Stanford University

The centuries-old notion of property is being called into question by scholars trying to make sense of it in digital contexts—such as data property and monopolies—and empower users whose data fuels the digital economy.

Most recently, Katharina Pistor argued that property is coded to rules that are also applicable to digital content, mainly through the current intellectual property frameworks. [1]

The notion of property as code, and thus as powerful as architectural design, is compelling because it takes us back to Lessig’s critical claim that in cyberspace, code is law. [2]

Some scholars argue for the empowerment of users by property rights over their data. Eric Posner and Glen Weyl, for example, have proposed what they called a “radical model” for users to (re)gain control over their data.[3] In November 2019, an article in the New York Times presented Dawn Song’s research on computer security and trustworthy AI, discussing the promise of a “new paradigm” where data is viewed as property, giving people control over their data and the ability to be “compensated for its use by corporations.”[4]

But wasn’t property passé? In 2001, Thomas Merrill and Henry Smith published the article “What happened to property in law and economics?” [5] in the Yale Law Journal, where they argued that law and economics schools contributed to an overall alienation from the in rem character of property. They claimed that Coase’s impact on the modern understanding of property rights is such that they are now commonly understood as nothing more than a “bundle of rights.” [6] Furthermore, in a more recent work, Aaron Perzanowski and Jason Schultz explained how the traditional in rem dimension of property has actually been neglected on more occasions than we realize. [7] This is in large part due to the transformation of products from physical to digital, and also to the blurred boundaries of physicality that the Internet of Things creates.

So is property back in style? Are we now at the stage where, in the virtual world of data, property is also law? Does the scope of property cover—or strive to cover—all machine-readable data (for better or for worse)? If this is the case, what does this mean for AI?

Good reasons to avoid replicating the one-size-fits-all model of intellectual property

Assigning property rights to all computable data that are or could potentially be used for purposes of machine learning and AI development is not necessarily a good idea. On the one hand, the property language is powerful enough to provide currently less protected players (such as consumers) with control over their data, and perhaps to promise stronger data privacy and data security protections. On the other hand, it is powerful enough to repeat maximalist enforcement trends similar to what we saw with copyright in the early days of the internet and the so-called “copyright wars.” [8] Property language can also be leveraged to overshadow actions within the public domain that ensure the open and free flow of data, as well as healthy competition within data-reliant markets.

If mapping the public domain has proved difficult in the context of a one-size-fits-all intellectual property system, with blurred lines between property and public domain [9], then why would we want to blindly replicate this model in another, much more complex virtual space?

One-size-fits-all in the context of intellectual property means that works that need the incentive of a property framework and works that do not are treated the exact same way. It means that content with both short and long average commercial lifespan is protected under the same (lengthy) term. What would a one-size-fits-all system of property protection for data look like? How would we ensure that it is not over- or under-inclusive? Would medical data and legal data have the same protection?

Compare the recent case Dinerstein v. Google, which dealt with patients’ medical data, with Georgia v. Public.Resource.Org, which discussed whether legal data can be copyrighted. The types of data in both cases are relevant to AI for the development of medical predictive tools and legaltech tools, such as predictive analytics. Depending on the nature of the data, there are different considerations that concern different, perhaps equally important principles: privacy; integrity; security; competition; transparency; justice; access to justice; and so forth.

The scope of this argument is relatively narrow. First, when it comes to the ownership of computable data, we should examine the pros and cons (such as costs) of using a powerful property framework with data. Second, we must exercise caution against the replication of the one-size-fits-all intellectual property model. Third, we must clarify (and perhaps redefine) the notion of property over yet another virtual, non-rivalrous resource that is useful for the development of AI systems. The title of this short reaction harkens back to Zittrain’s earlier expression of concerns about internet governance [10] and to issues that could, sooner rather than later, become critical in the conversation about AI governance.

This article was written by the author and published in Artificial Lawyer. 

Professor Panezi is an expert in law and technology and intellectual property. She specializes in Internet law and policy and in intellectual property law, with an emphasis on digital copyright, as well as in data protection, intellectual goods management, automation, machine learning and AI. Her current research focuses on digitization and AI. She has previously written on digitization, on copyright issues related to digital libraries, and on the European legal framework applicable to cultural heritage institutions. More recently, she has also been working on algorithmic control in everyday life as well as the future of work.

Note: The views expressed by the author of this paper are completely personal and do not represent the position of any affiliated institution.

[1] Katharina Pistor, The Code of Capital: How the Law Creates Wealth and Inequality (2019), toying with Lessig’s famous Code is Law maxim.
[2] Lawrence Lessig, Code: And Other Laws of Cyberspace (2009).
[3] Eric A. Posner and E. Glen Weyl, Radical Markets: Uprooting Capitalism and Democracy for a Just Society (2018).
[4] Craig S. Smith, Building a World Where Data Privacy Exists Online, The New York Times, Nov. 10, 2019.
[5] Thomas W. Merrill and Henry E. Smith, What Happened to Property in Law and Economics?, 111 Yale L.J. 357 (2001).
[6] Ibid., p. 365.
[7] Aaron Perzanowski & Jason Schultz, The End of Ownership: Personal Property in the Digital Economy (2016).
[8] See Peter Baldwin, The Copyright Wars: Three Centuries of Trans-Atlantic Battle (2016).
[9] See, among many, Pamela Samuelson, Challenges in Mapping the Public Domain, in P. Bernt Hugenholtz and Lucie Guibault (eds.), The Future of the Public Domain: Identifying the Commons in Information Law (Kluwer Law International, 2006), pp. 7-21.
[10] Jonathan L. Zittrain, Be Careful What You Ask For: Reconciling a Global Internet and Local Law, in Who Rules the Net? (2003).