TORONTO — A group of 14 major media companies are suing Cohere, claiming that the company is unfairly using their content to train its AI tools, then serving up full copies or cuts of their articles.
Toronto Star Newspapers, Vox Media and Condé Nast are among the publishers involved in the lawsuit, filed in the Southern District of New York on Thursday.
Cohere is engaging in “massive, systematic copyright infringement and trademark infringement,” the filing alleges. The publishers are seeking damages from the AI firm, as well as an order preventing it from using their content to train or fine-tune its large language models (LLMs).
Condé Nast titles like The New Yorker, Vanity Fair and Vogue “cannot live up to their exceptional standards if we allow their content to be stolen, distorted and trafficked,” said Condé Nast CEO Roger Lynch in a statement.
The lawsuit details examples of Cohere’s chat tool producing word-for-word or near-verbatim chunks of news stories in response to a prompt to provide the article. The suit claims that Cohere’s retrieval-augmented generation (RAG) feature, designed to produce more accurate answers by connecting to external data sources, also copies stories that have just been published.
In one case, the filing alleges, Cohere copied into its system a Toronto Star article from October 2024 about Ticketmaster pausing ticket transfers for Taylor Swift’s Eras Tour to deter theft. The chat tool’s response to a query about the story mimics the flow of the piece, replicates sections verbatim and paraphrases others, according to the filing.
In addition to copying their work, the publishers claim Cohere’s chat tool sometimes “manufactures fake pieces and attributes them to us.” The firm has previously claimed its focus on RAG limits such hallucinations.
The lawsuit is “misguided and frivolous,” Cohere communications head Josh Gartner said. He said the firm “strongly stands by its practices for responsibly training its enterprise AI,” and has controls to “mitigate the risk of IP infringement” and respect holders’ rights. “We would have welcomed a conversation about their specific concerns—and the opportunity to explain our enterprise-focused approach—rather than learning about them in a filing,” Gartner said.
Cohere is the latest AI firm to be hit with a copyright infringement claim. Five large Canadian media organizations sued OpenAI in November. The ChatGPT maker is also facing claims from The New York Times as well as MediaNews Group and Tribune Publishing. Anthropic settled a suit with music publishers over song lyrics. And Toronto-headquartered Thomson Reuters this week won a U.S. case alleging copyright infringement by Ross Intelligence, a now-defunct legal AI startup.
Many of the publishers suing Cohere have licensed their content to other LLM makers. OpenAI has deals with Axel Springer, Condé Nast, The Atlantic, and Vox Media. Reddit has also licensed content to AI firms.
In an interview before the lawsuit was filed, Cohere co-founder Nick Frosst told The Logic the firm is not pursuing licensing deals. “A lot of those partnerships are based on a consumer platform,” he said. “They’re licensing deals for saying, ‘Our consumer chatbot’s going to have access to Reddit,’ or something. We’re not really interested in that.” Cohere focuses on business uses of LLMs. Frosst said its customers want AI that can access work tools like Salesforce and Google Drive rather than social media posts.
For clients that build applications on top of Cohere’s models, the firm offers indemnification from claims of IP infringement.
Cohere is also among several AI companies that have called for Canada to update its copyright rules to include a carve-out for text and data mining to train machine-learning systems. Japan and Singapore have similar exemptions. While the U.S. does not, its copyright law is “more flexible” than Canada’s, Cohere argued in a submission to a federal consultation last year.