Lawsuit says OpenAI violated US authors' copyrights to train AI chatbot

Last updated on: 30 June, 2023 09:55 am

Writers say ChatGPT mined data copied from thousands of books without permission

(Reuters) - Two U.S. authors sued OpenAI in San Francisco federal court on Wednesday, claiming in a proposed class action that the company misused their works to "train" its popular generative artificial-intelligence system ChatGPT.

Massachusetts-based writers Paul Tremblay and Mona Awad said ChatGPT mined data copied from thousands of books without permission, infringing the authors' copyrights.

Matthew Butterick, an attorney for the authors, declined to comment. Representatives for OpenAI, a private company backed by Microsoft Corp (MSFT.O), did not immediately respond to a request for comment.

Several legal challenges have been filed over material used to train cutting-edge AI systems. Plaintiffs include source-code owners against OpenAI and Microsoft's GitHub, and visual artists against Stability AI, Midjourney and DeviantArt.

The targets of these lawsuits have argued that their systems make fair use of copyrighted work.

ChatGPT responds to users' text prompts in a conversational way. It became the fastest-growing consumer application in history earlier this year, reaching 100 million active users in January, only two months after it was launched.

ChatGPT and other generative AI systems create content using large amounts of data scraped from the internet. Tremblay and Awad's lawsuit said books are a "key ingredient" because they offer the "best examples of high-quality longform writing."

The complaint estimated that OpenAI's training data incorporated over 300,000 books, including from illegal "shadow libraries" that offer copyrighted books without permission.

Awad is known for novels including "13 Ways of Looking at a Fat Girl" and "Bunny." Tremblay's novels include "The Cabin at the End of the World," which was adapted into the M. Night Shyamalan film "Knock at the Cabin," released in February.

Tremblay and Awad said ChatGPT could generate "very accurate" summaries of their books, indicating that they appeared in its database.

The lawsuit seeks an unspecified amount of money damages on behalf of a nationwide class of copyright owners whose works OpenAI allegedly misused.
