The Courts / AI

New York Times Denies OpenAI's 'Hacking' Claim In Copyright Fight

An anonymous reader quotes a report from Reuters: The New York Times has denied claims by OpenAI that it "hacked" the company's artificial intelligence systems to create misleading evidence of copyright infringement, calling the accusation "as irrelevant as it is false." In a court filing on Monday, the Times said OpenAI was "grandstanding" in its request to dismiss parts of the newspaper's lawsuit alleging its articles were misused for artificial intelligence training. The Times sued OpenAI and its largest financial backer, Microsoft, in December, accusing them of using millions of its articles without permission to train chatbots to provide information to users.

The newspaper is among several prominent copyright owners including authors, visual artists and music publishers that have sued tech companies over the alleged misuse of their work in AI training. The Times' complaint cited several instances in which programs like OpenAI's popular chatbot ChatGPT gave users near-verbatim excerpts of its articles when prompted. OpenAI responded last month that the Times had paid an unnamed "hired gun" to manipulate its products into reproducing the newspaper's content. It asked the court to dismiss parts of the case, including claims that its AI-generated content infringes the Times' copyrights. "In the ordinary course, one cannot use ChatGPT to serve up Times articles at will," OpenAI said. The company also said it would eventually prove that its AI training made fair use of copyrighted content.

The Times replied on Monday that it had simply used the "first few words or sentences" of its articles to prompt ChatGPT to recreate them. "OpenAI's true grievance is not about how The Times conducted its investigation, but instead what that investigation exposed: that Defendants built their products by copying The Times's content on an unprecedented scale -- a fact that OpenAI does not, and cannot, dispute," the Times said.
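To make the test the Times describes concrete: prompting a model with the opening of an article and measuring how much of the known continuation comes back verbatim is, in effect, a prefix probe. Below is a minimal, hypothetical sketch of such a probe; the generate() callable stands in for whatever model API was actually used, the 300-character prefix and the overlap metric are arbitrary illustrative choices, and none of this code comes from the court filings.

from difflib import SequenceMatcher
from typing import Callable

def verbatim_overlap(article: str, generate: Callable[[str], str],
                     prefix_chars: int = 300) -> float:
    """Prompt the model with the first prefix_chars characters of an article
    and return the longest verbatim run it shares with the real continuation,
    as a fraction of that continuation's length."""
    prefix, continuation = article[:prefix_chars], article[prefix_chars:]
    output = generate(prefix)  # hypothetical model call, not a real API
    matcher = SequenceMatcher(None, output, continuation)
    match = matcher.find_longest_match(0, len(output), 0, len(continuation))
    return match.size / max(len(continuation), 1)

# A value near 1.0 would mean the model reproduced the rest of the article
# nearly verbatim; a value near 0.0 suggests ordinary, non-memorized output.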


Comments Filter:
  • by Rosco P. Coltrane ( 209368 ) on Tuesday March 12, 2024 @05:09PM (#64310671)

    AI here, AI there, AI everywhere.

    What a fucking bore. I'd rather read news about goat herding on Slashdot and everywhere else at this point...

    Surely there are other interesting tech topics. Give it a fucking rest already.

    • by AmiMoJo ( 196126 )

      I asked Google Bard^WGemini

      "Generate a Slashdot story that isn't about AI."

      Head in the Clouds: First Commercial Helium-3 Mining Operation Begins on the Moon (288 comments)
      story

      By cryptonaut (4988 points) | discuss (288)

      Hold onto your spacesuits, nerds! LunaCorp, the controversial space mining startup, has officially begun the first commercial extraction of Helium-3 on the lunar surface. This light isotope of helium, a potential game-changer for clean fusion energy, has long been theorized to exist on the M

  • Just a few? (Score:3, Interesting)

    by ichthus ( 72442 ) on Tuesday March 12, 2024 @05:22PM (#64310709) Homepage

    The Times replied on Monday that it had simply used the "first few words or sentences" of its articles to prompt ChatGPT to recreate them.

    Please, tell us more. "Just a few words or sentences" sounds like it was most likely sentences. And then, just how many sentences did you nudge it with? How derivative were the original articles to begin with? And, is it possible the articles themselves were, in whole or part, composed by AI?

    • Re:Just a few? (Score:5, Insightful)

      by medusa-v2 ( 3669719 ) on Tuesday March 12, 2024 @09:01PM (#64311007)

      How are those questions taken seriously?

      Even if one of their nudges included the entire first half of an article, ChatGPT duplicating the second half verbatim would be a strong indication that the original was part of the training data, which is the entire claim of the lawsuit.

      By definition, original articles are not derivative, but even if the NYTimes were basing their work on something from a third party, this would have no bearing on the lawsuit, except that maybe the third party would also have grounds to sue.

      The lawsuit was filed in Dec 2023, just over a year after ChatGPT was made publicly available. For the last question to even be possible, the NYTimes would have had to somehow build an AI system capable of not only writing articles but also plausibly fact-checking them, years *before* ChatGPT was available to the public, and then kept it secret in order to... what... sell newspapers? And then OpenAI comes along and copies their AI from the NYTimes' super secret AI, but the only thing the NYTimes is mad about is the article contents? This wouldn't even hold up in a Back To the Future reboot.

      • Re:Just a few? (Score:4, Interesting)

        by iAmWaySmarterThanYou ( 10095012 ) on Tuesday March 12, 2024 @11:41PM (#64311195)

        I think the NYT did exactly that with their Time Machine, just like the editors of the Hitchhiker's Guide:

        "Its editors, having to meet a publishing deadline, copied [some] information off the back of a packet of breakfast cereal, hastily embroidering it with a few footnotes in order to avoid prosecution under the incomprehensibly torturous Galactic Copyright Laws. It's interesting to note that a later and wilier editor sent the book backwards in time, through a temporal warp, and then successfully sued the breakfast cereal company for infringement of the same laws".

      • Re:Just a few? (Score:4, Insightful)

        by AmiMoJo ( 196126 ) on Wednesday March 13, 2024 @06:15AM (#64311677) Homepage Journal

        Additionally, if you can reproduce the rest of the article by providing the first few lines, that's a paywall bypass.

        • by gweihir ( 88907 )

          Indeed. I think this thing boils down to commercial copyright infringement on a mass scale. If so, criminal law comes into play.

      • by gweihir ( 88907 )

        Indeed. ChatGPT is just trying to confuse the issue because that is all they have, and they _know_ they did a great, big, illegal piracy campaign to get their training data. They are basically fucked and are just trying to postpone the inevitable.

  • by ebcdic ( 39948 ) on Tuesday March 12, 2024 @06:25PM (#64310805)

    ... but this defence sounds like "we didn't steal it and anyway they had to look under our bed to find it".

  • by timeOday ( 582209 ) on Tuesday March 12, 2024 @06:57PM (#64310839)
    Verbatim quoting by a model is a rare occurrence and a relatively easy thing to fix. They could even leave the model exactly as it is, run a non-AI plagiarism detector in post-processing, and regenerate any offending content until it passes (a rough sketch of that idea follows this thread).

    If NYT wants to make any real progress here they need new legislation to address their real issue: they are spending money to create information which OpenAI and others are feeding into their model without remuneration (or at least paying no more than an individual would pay) yet redistributing the information on a massive scale. It would be hard to craft new legislation to this effect, since most news outlets work this same way, with only NYT and a handful of other national papers actually spending on original reporting. But that is their real and lasting problem.

    • Not sure what they are trying to achieve in the lawsuit, TBH. OpenAI has already fixed the model, so their prayer for relief can't be that. Damages? Doubtful, given that it took some major backflips for the NYT to create the duplicates, and they were likely the only such dups... unless something turns up in discovery, I guess.

    • by gweihir ( 88907 )

      Not really. Once something is in the model, you cannot get it out anymore. You can make it less likely for some types of triggers you can think of, but other triggers will still produce the material. You can also only do this for a small amount of training material, or you will totally screw up the model.
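A rough, purely hypothetical sketch of the post-processing filter imagined in the parent thread: leave the model untouched, scan each draft for long verbatim n-gram overlaps with a protected corpus, and regenerate when a draft trips the check. Here generate() is again a placeholder for the model call, the 8-word n-gram window and retry limit are arbitrary, and nothing suggests any vendor actually works this way; the objection above about memorized material resurfacing still applies.

from typing import Callable, Iterable, Set, Tuple

def ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Word-level n-grams of a text, as a set of tuples."""
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def filtered_generate(prompt: str, generate: Callable[[str], str],
                      protected_docs: Iterable[str], n: int = 8,
                      max_tries: int = 3) -> str:
    """Return a draft that shares no n-gram of length n with the protected
    corpus, regenerating a bounded number of times before giving up."""
    protected: Set[Tuple[str, ...]] = set()
    for doc in protected_docs:
        protected |= ngrams(doc, n)
    for _ in range(max_tries):
        draft = generate(prompt)  # hypothetical model call, not a real API
        if not (ngrams(draft, n) & protected):
            return draft
    return "[no non-overlapping draft produced]"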

  • Capitalism inherently grants IP a value. That's what this is all about.

    End stage capitalism allows an entity to charge other people to use its IP. IP that can be reproduced infinitely at (virtually) no cost.

    Almost overnight, the principles and tenets of early stage capitalism are thrown out.

    The consequence is a wealth divide between the haves and have-nots, where what the haves have is IP.

    This fight is all about maintaining that dysfunctional status quo, despite the world bleeding wealth from every or
    • The IP is not the electronic bits.

      The IP is the words themselves and the cost to put those words together.

      The storage medium and mechanism is irrelevant in IP discussions.

    • by Anonymous Coward

      End stage capitalism

      Socialism

      It's called Socialism.
      Takes fewer keystrokes too.

  • The New York Times announced that it's going to sue its readers for copyright infringement for using their paper to learn things.
    • by gweihir ( 88907 )

      Animism is stupid. AI does not "learn". That is just marketing speech. AI is _trained_ and that is fundamentally different.

"...a most excellent barbarian ... Genghis Kahn!" -- _Bill And Ted's Excellent Adventure_

Working...