Comedian Sarah Silverman and two other authors sued OpenAI and Meta in federal court on Friday. The suits allege that the tech companies infringed the authors’ copyrights while training their artificial intelligence products.
OpenAI’s ChatGPT, released in November 2022, and Meta’s LLaMA, released in February 2023, are large language models (LLMs) designed to mimic human intelligence by conversing with users. LLMs are trained on large datasets in order to appear knowledgeable on a variety of subjects. An LLM relies on the information contained in its training datasets to produce an intelligible response.
According to the complaint against OpenAI, ChatGPT will summarize each author’s book when prompted: Sarah Silverman’s The Bedwetter, Christopher Golden’s Ararat, and Richard Kadrey’s Sandman Slim. Silverman, Golden, and Kadrey assert that they never gave permission for their works to be used to train ChatGPT, and they argue that the program’s ability to summarize their books indicates that the books were included in its training sets without their permission.
The plaintiffs believe that Meta acquired their books from illicit sources, without their permission, to add to LLaMA’s training datasets. For instance, one of the sources Meta notes in a paper about LLaMA’s training dataset is ThePile, which, according to the complaint, includes “a copy of the contents of the Bibliotik private tracker,” a shadow library. The plaintiffs allege that this shadow library contains unauthorized copies of their works.
The complaints against Meta and OpenAI assert direct copyright infringement, vicarious copyright infringement, removal of copyright-management information (and false assertion of copyright in the complaint against Meta), unfair competition, unjust enrichment, and negligence. Both complaints seek class certification for all people or entities in the U.S. that own a U.S. copyright in any work used as training data by Meta or OpenAI.
A few weeks ago, more than 5,000 authors, including Margaret Atwood and Jodi Picoult, signed a letter to OpenAI, Meta, Alphabet, IBM, and other AI developers accusing them of wrongfully feeding their AI products copyrighted works in order to “mimic and regurgitate [their] language, stories, style, and ideas.” The letter urges the CEOs of these companies to obtain consent from, credit, and compensate writers when their works are used to train AI programs.
Sarah Silverman and novelists sue ChatGPT-maker OpenAI for ingesting their books, AP News (July 12, 2023)
Sarah Silverman is suing OpenAI and Meta for copyright infringement, The Verge (July 9, 2023)
Image Credit: Song_about_summer / Shutterstock.com