Authors Charge Apple with Illegally Utilizing Pirated Books for AI Model Training


**Apple Faces Class Action Lawsuit Regarding Alleged Utilization of Pirated Books for AI Training**

A proposed class action lawsuit filed in federal court in Northern California alleges that Apple unlawfully used copyrighted books to train its artificial intelligence models. The suit, brought by authors Grady Hendrix and Jennifer Robertson, claims that Apple used a dataset known as Books3, which allegedly contains pirated works, including books written by the plaintiffs.

### Authors Ground the Allegation in Apple’s Own Materials

The lawsuit cites Apple’s own documentation for its OpenELM language models, which were published on Hugging Face last year. According to the complaint, Apple included Books3 in the training data for OpenELM and possibly for its Foundation Language Models as well. The authors contend:

> “However, Apple is developing part of this new venture using Books3, a dataset comprising pirated copyrighted books that features the published works of Plaintiffs and the Class.”

Books3, the dataset at issue, is linked to RedPajama, another dataset that has been identified as containing pirated content.

### Legal Requests

Hendrix and Robertson are asking the court to certify the lawsuit as a class action, with themselves as representatives of the affected authors. They are seeking several remedies, including:

- Class certification with the plaintiffs as representatives.
- Statutory and compensatory damages, as well as restitution and disgorgement.
- A permanent injunction barring Apple from continuing the alleged unlawful conduct.
- Destruction of all AI models and training datasets that incorporate the plaintiffs’ works.
- An award of legal fees and costs.

### Context of the Lawsuit

The lawsuit comes as similar cases have produced mixed outcomes. Anthropic recently settled a comparable lawsuit for $1.5 billion, underscoring the financial stakes of copyright claims tied to AI training. Meta, by contrast, recently won a case in which a judge found that its use of copyrighted books for AI training qualified as fair use, a ruling that has fueled debate over the legality of the practice.

### Wider Implications

The debate over AI training and copyright has drawn attention well beyond the technology sector. President Trump has remarked on the difficulty of compensating authors for every piece of material used in AI training, suggesting that such a system is not feasible.

As the legal landscape continues to shift, the outcome of this lawsuit could set a significant precedent for how AI companies handle copyrighted material and for the rights of authors in the digital era. The central question remains: should authors be compensated when their works are used to train AI models? The consequences of this case could extend well beyond the courtroom.