Encyclopedia Britannica and Merriam-Webster have sued OpenAI, accusing the AI company of “massive copyright infringement,” according to the complaint. Britannica, which owns Merriam-Webster, claims that OpenAI has used nearly 100,000 of its copyrighted online articles to train its LLMs without permission.
Britannica further alleges that OpenAI violates copyright law by generating outputs that contain “full or partial verbatim reproductions” of its content and by using its articles in ChatGPT’s retrieval-augmented generation (RAG) workflow, which lets the LLM look up newly updated information. Additionally, Britannica accuses OpenAI of violating the Lanham Act when ChatGPT generates hallucinations and falsely attributes them to the publisher.
The lawsuit claims that ChatGPT deprives web publishers like Britannica of revenue by providing responses that substitute for their content and compete directly with it. Britannica also argues that ChatGPT’s inaccurate information jeopardizes the public’s access to reliable online information.
Britannica is not alone in taking legal action against OpenAI over copyright. The New York Times, Ziff Davis, and various newspapers in the US and Canada, including the Chicago Tribune and the Toronto Star, have also sued the company.
Britannica also has an ongoing lawsuit against Perplexity. Legal precedent on the use of copyrighted content to train LLMs is not well established. However, in another case, Anthropic convinced a judge that using content as training data is transformative, yet it still agreed to pay $1.5 billion to settle claims over illegally downloaded books.
OpenAI did not provide a comment to TechCrunch prior to publication.
