Generative AI Copyright Concerns & 3 Best Practices in 2024

Updated on Jan 2

Table of contents

1- Are AI-generated works eligible for copyright protection?
2- Who is the owner of the generative AI copyright?
3- Can copyright-protected data be used as training data?
4- Generative AI copyright best practices
5- For more on generative AI

As with any new technology, generative AI raises legal and ethical issues that must be addressed. One of the most important is copyright, which determines who owns the rights to creative works and how those works may be used. Companies that rely on generative AI without knowing the local legislation on generative AI copyright risk reputational damage or legal fines.

In this article, we explore generative AI copyright concerns under 3 topics:

The eligibility of AI-generated works for copyright protection
The ownership of this copyright
Use of copyrighted works as input in AI training algorithms

1- Are AI-generated works eligible for copyright protection?

Whether AI-generated works are eligible for copyright protection varies from country to country. In general, however, substantial human involvement is required for a work to be eligible.

There are doubts about whether works generated by AI tools should be eligible for copyright protection at all. The two main positions are:

AI-generated works do not meet the requirements for copyright protection because they are not the result of human creativity.
AI-generated works should be eligible for copyright protection because they are the product of complex algorithms and programming. Moreover, the creators of these algorithms and programs should be recognized as the authors of the works generated by the AI.

For example, in the United States, copyright law does not protect works created solely by a machine. But if an individual can demonstrate substantial human involvement in a work's creation, the work may plausibly receive copyright protection.

AI-assisted artwork received copyright protection

In September 2022, the US Copyright Office made history when it issued an unprecedented registration for a comic book named Zarya of the Dawn.¹ The book was developed using the text-to-image AI tool Midjourney (see Figure 1). The author declared that the artwork was AI-assisted rather than solely generated by the AI. In addition to producing the AI-generated images, she wrote and structured the story, designed each page’s layout and made artistic decisions about how to arrange all of its components. In February 2023, however, the Office narrowed the registration to the text and the selection and arrangement of the book’s elements, excluding the individual Midjourney-generated images.

Figure 1. Drawings from the last page of AI-generated comic book Zarya of the Dawn. (Source: Zarya of the Dawn)

Award-winning Midjourney image was denied copyright protection

Another controversial generative art example is an AI-generated print that won an art competition at the Colorado State Fair.² The creator said that he spent several weeks refining his prompts and manually selecting the finished product. The award-winning AI-generated art is shown in Figure 2 below.

Figure 2. The award-winning AI-generated print Theatre d’Opera Spatial. (Source: The Verge)

This image was denied copyright protection.³ Ultimately, the eligibility of AI-generated works for copyright protection raises the question of who would own the copyright in these cases. Countries that require human agency for authorship generally deny copyright protection to AI-generated works.

2- Who is the owner of the generative AI copyright?

Authorship and ownership rights are also disputed.

Under the copyright law of most countries, the creator of a work is generally considered the copyright owner. However, when a work is created by AI, it is unclear who the creator is. Such ambiguity can create problems in determining who has the right to exploit the work and who can enforce the copyright against violations.

There can be different solutions to this problem:

AI itself as the creator of the work, in which case the AI owner would have the copyright.
AI model’s human programmer as the creator, in which case the programmer would be the owner of the copyright.
Humans that prepared the AI model’s training data as the creators.

AI or its developer as the copyright holder

Various countries, including Hong Kong, India, Ireland, New Zealand and the UK, explicitly grant authorship rights to programmers. For example, the United Kingdom provides copyright for works created entirely by computers. Yet, it deems that the author should be “the person by whom the arrangements necessary for the creation of the work are undertaken.”⁴

Consequently, there are several interpretations of who this “person” is. The generative model’s developer or operator? Or the model itself?

Stephen Thaler, an AI inventor and creator of the Creativity Machine, is challenging the US Copyright Office's stance on AI authorship. He filed a lawsuit against the Copyright Office in June 2022 over its refusal to register a digital image produced by his system. Rather than claim authorship himself, Thaler asked that his Creativity Machine alone be recognized as the creator. “My interest is the definition of what a person is,” Thaler said in an interview with Bloomberg Law.⁵ “What I’m building, what many will argue, is sentient machine intelligence. So maybe expansion to the term sentient organism would be in order.”

Humans creating the training data as copyright holders

Take the case where authorship and copyright are given to the model's programmer. Besides the programmed algorithm, generative AI models rely on an immense amount of data to create new content. For example, look at the Next Rembrandt painting in the figure below.

Figure 3. “The Next Rembrandt” is a computer-generated, 3D-printed painting based on the real paintings of 17th-century Dutch painter Rembrandt. (Source: The Guardian)

Given this highly artistic output, it is hard to give authorship solely to the programmer while bypassing the immense input from the real artist Rembrandt.

Currently, copyright ownership is as debatable as the eligibility of the generated works. Both vary from country to country and are likely to be reformed as the use of generative AI evolves.

3- Can copyright-protected data be used as training data?

Another copyright question in the context of AI models is whether copyright-protected data can be used for training. In short, if the use qualifies as fair use, there is no problem. However, the line between fair use and copyright infringement is blurred in most jurisdictions. Japan is an exception: there, copyrighted works can be used to train generative AI.⁶

Fair use vs copyright infringement

Intellectual property (IP) law is a body of legislation that safeguards and enforces the rights of creators and owners of creative works such as inventions, writings, music, designs and other intellectual property.

Copyright infringement can carry serious consequences, including civil damages and, for willful infringement in some jurisdictions, criminal penalties such as imprisonment. Ignorance of IP law while using copyrighted material does not excuse liability or provide a legal defense against claims made by copyright owners.

The fair use doctrine allows limited use of copyrighted material without permission from the copyright holder if the use falls under certain categories, such as:

criticism/commentary
news reporting
teaching
research

For instance, using copyrighted material for educational purposes could qualify as fair use, whereas using copyrighted material for commercial purposes without permission from the copyright holder is more likely to be considered copyright infringement.

Copyrighted data for training purposes: fair use or copyright infringement?

Training AI models on copyrighted data will likely be considered fair use, although courts have not yet settled the question. The same does not necessarily apply to generating content. To put it more clearly: you may be able to use someone else’s data to train AI models for your needs, but what you do with the model's generated output might still infringe copyright law.

OpenAI has argued that training AI systems on copyrighted data should be considered fair use.⁷

Another important factor in assessing fair use is whether academic researchers or nonprofit organizations produced the training data and models. Startups are well aware of this, as it tends to reinforce their fair use defenses.

As an example, Stability AI – the distributor of Stable Diffusion – neither collected the model’s training data nor trained the models. Rather, it funded and coordinated this work with academics. The Stable Diffusion model is licensed by a German university, enabling Stability AI to transform its creation into a commercial service while remaining legally separate from it.⁸

However, when AI-generated works are copyrighted and then used in training sets, a legal conundrum can arise if the original creator did not license such use. To ensure that copyright and fair use laws are respected, producers of generative AI content should exercise due diligence and obtain proper licenses where possible.

Can AI-created data be used as training data?

Once courts clarify all of the open questions above, we will also have the ingredients to determine how AI-created data can be used for training and how the trained models can be used. This is a critical question as AI-created data is already fueling a generative AI boom.⁹

4- Generative AI copyright best practices

We recommend 3 primary steps for businesses:

Identify your business's risk appetite for generative AI

This helps identify which use cases make sense for your business. For example, you may not want AI-generated code in your business's most valuable proprietary codebase.

Leverage vendor commitments to minimize your business's risk

To ease enterprises’ concerns, vendors like Adobe and Microsoft are committing to defend their clients if the use of their solutions leads to legal issues.¹⁰

Embrace ethical AI

While building your business's generative AI stack, pay attention, from a copyright perspective, to the data used in training and fine-tuning generative AI models.

Learn more about our recommendations regarding enterprise generative AI.

5- For more on generative AI

If you have questions about generative AI or need help in finding vendors, reach out.

External Links

1. “Artist receives first known US copyright registration for latent diffusion AI art.” Ars Technica, 22 September 2022. Accessed 1 January 2023.
2. “Artwork generated using AI software Midjourney won a state competition.” The Verge, 1 September 2022. Accessed 1 January 2023.
3. “US Copyright Office denies protection for another AI-created image.” Reuters, 7 September 2023. Accessed 10 September 2023.
4. “Artificial Intelligence and Intellectual Property: copyright and patents.” GOV.UK, 28 June 2022. Accessed 1 January 2023.
5. “Artificial Intelligence Can Be Copyright Author, Suit Says (1).” Bloomberg Law, 3 June 2022. Accessed 1 January 2023.
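6. “決算行政監視委員会分科会質疑を振り返る” [Looking back on the questioning at the Audit and Oversight of Administration Committee subcommittee]. 24 April 2023. Accessed 11 June 2023.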
7. “Before the United States Patent and Trademark Office Department of Commerce Comment Regarding Request for Comments on Intell.” USPTO. Accessed 1 January 2023.
8. “Revolutionizing image generation by AI: Turning text in … – LMU Munich.” LMU München, 1 September 2022. Accessed 1 January 2023.
9. “lamini-ai.” GitHub. Accessed 23 May 2023.
10. “Microsoft announces new Copilot Copyright Commitment for customers.” Microsoft, 7 September 2023. Accessed 10 September 2023.

Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (as per similarWeb) including 60% of Fortune 500 every month.

Cem's work has been cited by leading global publications including Business Insider, Forbes, Washington Post, global firms like Deloitte, HPE, NGOs like World Economic Forum and supranational organizations like European Commission. You can see more reputable companies and media that referenced AIMultiple.

Throughout his career, Cem served as a tech consultant, tech buyer and tech entrepreneur. He advised businesses on their enterprise software, automation, cloud, AI / ML and other technology related decisions at McKinsey & Company and Altman Solon for more than a decade. He also published a McKinsey report on digitalization.

He led technology strategy and procurement of a telco while reporting to the CEO. He has also led commercial growth of deep tech company Hypatos that reached a 7 digit annual recurring revenue and a 9 digit valuation from 0 within 2 years. Cem's work in Hypatos was covered by leading technology publications like TechCrunch and Business Insider.

Cem regularly speaks at international technology conferences. He graduated from Bogazici University as a computer engineer and holds an MBA from Columbia Business School.
