Adobe Hit With Class-Action Lawsuit Over Alleged Use of Pirated Books to Train AI Model
The lawsuit alleges Adobe used unauthorised copies of books, including works by the plaintiff author, to train its SlimLM language model.
Adobe is facing fresh legal scrutiny over its use of artificial intelligence, as a new lawsuit alleges the company trained one of its AI models on pirated books without author consent.
A proposed class-action lawsuit filed on behalf of Oregon-based author Elizabeth Lyon claims Adobe used unauthorised copies of books, including her own works, to train its SlimLM language model.
SlimLM is described by Adobe as a small language model designed to support document-assistance tasks on mobile devices.
According to the complaint, SlimLM was pre-trained using SlimPajama-627B, a large open-source dataset released by AI chipmaker Cerebras in 2023. Lyon alleges that some of her copyrighted books were included in a subset of that dataset used by Adobe.
The lawsuit argues that SlimPajama is a derivative of the RedPajama dataset, which in turn allegedly incorporated the controversial “Books3” collection — a repository of roughly 191,000 books widely criticized for containing pirated material.
“The SlimPajama dataset was created by copying and manipulating the RedPajama dataset (including copying Books3),” the lawsuit states. “Thus, because it is a derivative copy of the RedPajama dataset, SlimPajama contains the Books3 dataset, including the copyrighted works of Plaintiff and the Class members.”
Books3 has become a recurring flashpoint in AI-copyright litigation. RedPajama has been referenced in lawsuits against Apple and Salesforce, both accused of training AI systems on copyrighted works without permission or compensation.
Separately, Adobe recently launched Adobe Photoshop, Adobe Express and Adobe Acrobat inside ChatGPT, allowing users to edit images, create designs and work with PDFs directly through OpenAI’s popular conversational AI platform.