A class-action lawsuit has been filed in California federal court against OpenAI, the company that built ChatGPT. The lawsuit claims that OpenAI scraped “massive amounts of personal data from the internet” and that, by using that data to train its AI models, it violated the copyrights and privacy of countless internet users.
Microsoft, which has invested heavily in OpenAI and has built the technology into a number of its products, was also named as a defendant in the suit. The lawsuit is seeking damages on behalf of the copyright owners.
Clarkson, the law firm behind the suit, has experience bringing class-action lawsuits in multiple areas, from data breaches to false advertising. Authors Paul Tremblay and Mona Awad kicked off the lawsuit by claiming that OpenAI used data from books available online without permission, infringing on the authors’ copyrights. Clarkson is actively seeking more plaintiffs.
Timothy K. Giordano, a partner with the firm, says that “by collecting previously obscure personal data of millions and misappropriating it to develop a volatile, untested technology, OpenAI put everyone in a zone of risk that is incalculable – but unacceptable by any measure of responsible data protection and use.”
This is not the first time that OpenAI and Microsoft have faced a class-action lawsuit. In November 2022, a lawsuit alleged that OpenAI, Microsoft, and GitHub were using licensed code to train AI software, including GitHub’s Copilot code-writing tool.
Earlier this month, OpenAI was also sued for defamation when ChatGPT produced text that wrongly accused a radio host in Georgia of fraud.
Although many large tech firms also scrape data from the internet, Clarkson chose to target OpenAI because of its prominence in leading the push for AI development.
Overall, legislation is lacking in this area, and regulators are discussing new laws that would require companies to provide more transparency around the data they use to build AI models. Some companies are also actively trying to stop firms from scraping their data.