
The Scope of Copyright in the AI Economy

19.08.2024

I. Copyright in the AI economy: GitHub users versus GitHub Copilot, the New York Times versus OpenAI, and record labels versus Suno, Udio and others

Modern generative AI is widely used and forms the basis for a variety of services and applications. Developers typically train the underlying neural networks with a wide range of data, which the AI uses to generate new content, such as text, images, music or other media. Since this training data is often protected by copyright, the lawful use of this data has been a concern for US courts for some time.

1. GitHub users versus GitHub, Microsoft and OpenAI (Copilot)

GitHub, Microsoft and OpenAI are currently facing a class action lawsuit in the United States District Court for the Northern District of California. The plaintiffs allege these companies used their open-source software without consent to train the AI-based assistant “Copilot”. They claim that Copilot suggests code to users that is identical or similar to publicly available code of the plaintiffs. Recently, a California judge dismissed 20 of the 22 claims in the lawsuit. However, the court did not rule on whether Copilot’s method of generating code violates open-source licence terms, which usually require that the original author and licence text be retained in any distribution or derivative of the code.

2. The New York Times versus OpenAI

The New York Times newspaper initially took action against OpenAI on the grounds that OpenAI’s large language model had reproduced copyrighted texts from the newspaper’s articles. OpenAI then questioned the protectability of these texts and is now seeking access to the newspaper’s research documents, drafts and other materials, purportedly to assess whether the reproduced texts are indeed protected by copyright. So far, OpenAI has avoided invoking the fair use doctrine. This US legal principle allows the use of copyrighted works without the author’s permission if the use appears to be justified for purposes such as criticism, commentary, reporting, education or research. OpenAI’s reluctance to rely on fair use may stem from a desire to avoid a potentially unfavourable precedent regarding the fair use doctrine in the context of the training of large language models, which could have broader implications for the business models of AI providers.

3. Record labels versus Suno, Udio and others

Legal disputes between AI providers and rights holders are also emerging in the music industry. Specifically, audio AI services such as Suno and Udio are facing litigation from music labels including Sony Music, Atlantic Records, Capitol Records and Warner Records. Suno and Udio allow users to generate music in various styles from text prompts, potentially drawing on copyrighted musical works for their outputs. From the perspective of German legal doctrine, the core of these disputes lies at the intersection of permissible adaptations and rearrangements (section 23 of the German Copyright Act (Urheberrechtsgesetz, UrhG)) and independent musical works (section 2(1), no. 2 of the German Copyright Act). The disputes therefore raise classic copyright issues, such as how to distinguish a new creation from the original work.

II. De lege lata – What legal tools does IP law provide to address AI exploitation?

In the AI context, copyright law has until now primarily served as a defensive tool for rights holders and established licensors against the use of relevant materials by new AI companies.

The boundary of copyright protection and the question of whether an exception applies in favour of the AI provider are highly case-specific. This explains the numerous legal disputes currently observed across various markets, including software, text and music. Under German law, the data mining exception (section 44b of the German Copyright Act) could become a focal point in such disputes. This exception, introduced in 2021 on the basis of the European DSM Directive, generally allows large amounts of data to be used for machine learning purposes without prior permission from the copyright holder. However, copyright holders have an opt-out right under section 44b of the German Copyright Act, which is intended to be exercised in future through a machine-readable reservation of rights. Such reservations may not yet be widely used in practice.
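
What a machine-readable reservation of rights looks like in practice is not specified here. As a minimal sketch, assuming a website declares its reservation in a robots.txt file (an assumption for illustration only; other formats are conceivable), a crawler collecting training data could check for it roughly as follows. The crawler names ExampleTrainingBot and ExampleSearchBot, the URL and the function has_reserved_rights are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the site blocks one (made-up) AI training
# crawler while leaving all other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""


def has_reserved_rights(robots_txt: str, crawler_name: str, url: str) -> bool:
    """Return True if robots.txt forbids this crawler from fetching the URL.

    Treating a robots.txt "Disallow" as a reservation of rights is an
    assumption of this sketch, not a statement of what the law requires.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(crawler_name, url)


if __name__ == "__main__":
    article = "https://example.com/articles/1"
    for bot in ("ExampleTrainingBot", "ExampleSearchBot"):
        reserved = has_reserved_rights(ROBOTS_TXT, bot, article)
        print(f"{bot}: rights reserved = {reserved}")
```

Whether a reservation expressed in this way would be legally sufficient is a separate question that the code cannot answer; the sketch only shows that such a declaration can be evaluated automatically before any content is collected.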

When determining whether German law, with its data mining exception, or US law, with a possible fair use exception, applies, the “country of protection” principle must also be taken into account. This principle of international copyright law generally ties the applicable law to the place where the relevant use (e.g. the training of an AI) takes place. If the training is run or controlled on servers in several locations, the copyright law applicable at each location must therefore be complied with. As a result, multiple copyright regimes may have to be considered simultaneously.

III. Further options for AI users

AI applications and providers have long since established themselves across a wide range of markets. Although copyright law gives rights holders substantial tools to defend against the use of their protected materials by AI, the innovative applications that AI offers are likely to prevail in these markets over time.

In many cases, the best solution to such disputes is therefore likely to be found in contract law. In the music industry, for instance, the company Landr is taking a practical approach by allowing musicians to upload their copyrighted works to the company’s platform specifically for the training of AI tools. In exchange for granting these rights, musicians receive a direct share of the profits generated by these tools.

A similar contractual solution to the existing legal uncertainties can be seen in the strategic partnership between the American magazine Time and the AI company OpenAI. Time provides OpenAI with access to current and historical data and content from the magazine. This data is used to improve OpenAI’s AI capabilities. In return, the partnership helps the magazine to improve its journalistic products through ChatGPT and other OpenAI tools and to develop AI-optimised products. This represents a classic win-win model.

Whether these or other contractual solutions will prevail remains to be seen. In any case, it is worthwhile for all companies in copyright-related industries, be it software, music, text, or other areas, to closely examine if and how they are currently contractually and legally protected against AI use of their materials. Additionally, they should assess whether and how they have already addressed issues such as those arising in the context of data mining under section 44b of the German Copyright Act.