News Orgs Say AI Firm Stole Articles, Spit Out 'Hallucinations' (arstechnica.com) 20

Posted by BeauHD on Thursday February 13, 2025 @07:40PM from the here-we-go-again dept.

An anonymous reader quotes a report from Ars Technica: Conde Nast and several other media companies sued the AI startup Cohere today, alleging that it engaged in "systematic copyright and trademark infringement" by using news articles to train its large language model. "Without permission or compensation, Cohere uses scraped copies of our articles, through training, real-time use, and in outputs, to power its artificial intelligence ('AI') service, which in turn competes with Publisher offerings and the emerging market for AI licensing," said the lawsuit (PDF) filed in US District Court for the Southern District of New York. "Not content with just stealing our works, Cohere also blatantly manufactures fake pieces and attributes them to us, misleading the public and tarnishing our brands."

Conde Nast, which owns Ars Technica and other publications such as Wired and The New Yorker, was joined in the lawsuit by The Atlantic, Forbes, The Guardian, Insider, the Los Angeles Times, McClatchy, Newsday, The Plain Dealer, Politico, The Republican, the Toronto Star, and Vox Media. The complaint seeks statutory damages of up to $150,000 under the Copyright Act for each infringed work, or an amount based on actual damages and Cohere's profits. It also seeks "actual damages, Cohere's profits, and statutory damages up to the maximum provided by law" for infringement of trademarks and "false designations of origin."

In Exhibit A (PDF), the plaintiffs identified over 4,000 articles in what they called an "illustrative and non-exhaustive list of works that Cohere has infringed." Additional exhibits provide responses to queries (PDF) and "hallucinations" (PDF) that the publishers say infringe upon their copyrights and trademarks. The lawsuit said Cohere "passes off its own hallucinated articles as articles from Publishers." Cohere said in a statement to Ars: "Cohere strongly stands by its practices for responsibly training its enterprise AI. We have long prioritized controls that mitigate the risk of IP infringement and respect the rights of holders. We would have welcomed a conversation about their specific concerns -- and the opportunity to explain our enterprise-focused approach -- rather than learning about them in a filing. We believe this lawsuit is misguided and frivolous, and expect this matter to be resolved in our favor."

Further reading: Thomson Reuters Wins First Major AI Copyright Case In the US

This is rich. (Score:5, Interesting)

by zooblethorpe ( 686757 ) writes: on Thursday February 13, 2025 @07:56PM (#65165197)

That last sentence of Cohere's statement just drips with liquefied bullshit.
> We would have welcomed a conversation about their specific concerns -- and the opportunity to explain our enterprise-focused approach -- rather than learning about them in a filing.
Indeed. <shakes head/>
Let's look at this more closely:
> We would have welcomed a conversation about their specific concerns

How about, "You're appropriating our content en masse, reworking it, and passing off the resulting mish-mash shit as somehow our product and not yours." Sheesh.
> -- and the opportunity to explain our enterprise-focused approach --

Translation: "We're making a business out of repurposing your content and misrepresenting the results."
> rather than learning about them in a filing.
This is like playing the victim because the targets of your burglary had the audacity to file charges, instead of coming to you nicely and politely to talk about how much they really would have preferred it if you hadn't ransacked their house.
I understand that there are oodles of issues with current copyright law. That said, I don't think these AI shysters are being unjustly attacked for what certainly looks like their wholesale (soon to be retail!) misuse of content.

- Re: (Score:3)
  
  by stabiesoft ( 733417 ) writes:
  
  It is classic tech. Do it and ask forgiveness. They would not want me on the jury. I'd go for the 150K * number of infringements. I'd bk them. And set an expectation that other juries will too.
  - Re: (Score:1)
    
    by africanrhino ( 2643359 ) writes:
    
    It’s unreasonable. 99% of content production is reworking the work of others; 99% of the time, it’s just rewriting existing work that was gleaned elsewhere.
    - Re: (Score:2)
      
      by jvkjvk ( 102057 ) writes:
      
      It is certainly not unreasonable.
      They egregiously broke copyright law. Egregiously. And profited off of that. They have no remorse for what they did and want to continue into the future, not giving content creators anything.
      F* them.
      This isn't about "reworking the work of others". It's wholesale theft. If you can't see the difference you are too far gone. Cult much?
- Re: (Score:1)
  
  by TheWho79 ( 10289219 ) writes:
  
  Then why don't they man up and take on Google? Google says it has had AI algos running in its systems since 2006. They do EXACTLY the same thing that Cohere did. Hell, for 20 years Google even REPUBLISHED in full entire websites under a falsely labeled link "cached page". If Cohere is guilty of anything, it is being the small fish in the big sea. If any of these suits win, the next day there should be a million suits against Google.
  - Re: (Score:2)
    
    by Gideon Fubar ( 833343 ) writes:
    
    yes?
    who is it that you think you're undermining by saying this?
    - Re: (Score:1)
      
      by TheWho79 ( 10289219 ) writes:
      
      The original comment
      > How about, "You're appropriating our content en masse, reworking it, and passing off the resulting mish-mash shit as somehow our product and not yours." Sheesh.
      Which is exactly what Google did. So you gotta ask, why do these suits not name Google in all this?
      - Re: (Score:2)
        
        by Gideon Fubar ( 833343 ) writes:
        
        Uh.
        No, I don't need to ask.
        I think that it's stupid that Google already have publishing deals with these companies, but I know they do. You could have found this out if you'd looked.
  - Re: (Score:2)
    
    by butlerm ( 3112 ) writes:
    
    "Republishing" a web page verbatim - advertising and all - as associated with the original URL and as a backup for a way to click through to access the current article if it is still being made available to anyone who walks by is a lot closer to fair use than what most of these AI companies are doing. No one would use the cached version if the original publisher didn't take it down, or shut the website down, or was acquired by a new owner that wanted it placed behind a paywall, or changed the URL for no pa
    - Re: (Score:1)
      
      by TheWho79 ( 10289219 ) writes:
      
      We scraped millions of pages out of Google's "cached" page
- Re: (Score:3)
  
  by Visarga ( 1071662 ) writes:
  
  > Translation: "We're making a business out of repurposing your content and misrepresenting the results."
  
  This is wrong. Their business is to create AI tools, it is users who prompt models to do anything they do. LLMs are the worst infringement tools ever invented - they almost never reproduce exactly, cost money and are slow, while good old copying is free, instant and has perfect fidelity. Who would use AI to infringe?
Wow, it must've really been egregious (Score:3)

by 93 Escort Wagon ( 326346 ) writes: on Thursday February 13, 2025 @09:36PM (#65165381)

I mean, The Atlantic and The Republican are suing them together!?

- Re: (Score:2)
  
  by TrumpShaker ( 4855909 ) writes:
  
  Hey, when your enemy is the enemy of your other enemy, they are your frenemy?
  - Re: (Score:2)
    
    by nightflameauto ( 6607976 ) writes:
    
    Sabbat: 'The Best of Frenemies' was a great tune.
- Re: (Score:2)
  
  by Archtech ( 159117 ) writes:
  
  Many Americans avoid strong past tenses. Count your blessings it didn't say "spitted".
On the other hand (Score:2)

by Visarga ( 1071662 ) writes:

On the other hand Conde Nast is always truthful, no hallucinations or misinformation. No siree!
Statistical Data Points, Right? (Score:4, Interesting)

by Musical_Joe ( 1565075 ) writes: on Friday February 14, 2025 @08:59AM (#65166061)

I know this is Slashdot, but I DID actually look at the linked examples whereby the AI tool replicated entire (or almost entire) articles. And... as long as the evidence isn't completely fabricated, it's genuinely surprising. People who take the position of LLM operators always say that it's just statistical data points and the model doesn't contain enough of the original to be able to spit out anything that would count as infringing. Well... not in this case, no sirree - we're talking multiple paragraphs of text lifted verbatim. For one, I cannot see any "fair use" defence (or any other kind) here. And two, a question: is there something about this specific model that enabled it to quote such large chunks of text, or is this an eye-opener that ALL LLMs might do the same thing?

AI as a Plagiarism Umbrella (Score:2)

by BrendaEM ( 871664 ) writes:

I didn't know it was loaded. It's a bug in the software. It's out of control. I was only following orders. I've got to go to the bank.
