OpenAI Disputes Authors' Claims That Every ChatGPT Response is Derivative Work 119

Posted by msmash on Wednesday August 30, 2023 @04:00PM from the let-the-game-begin dept.

OpenAI has responded to a pair of nearly identical class-action lawsuits from book authors -- including Sarah Silverman, Paul Tremblay, Mona Awad, Chris Golden, and Richard Kadrey -- who earlier this summer alleged that ChatGPT was illegally trained on pirated copies of their books. From a report: In OpenAI's motion to dismiss (filed in both lawsuits), the company asked a US district court in California to toss all but one claim alleging direct copyright infringement, which OpenAI hopes to defeat at "a later stage of the case." The authors' other claims -- alleging vicarious copyright infringement, violation of the Digital Millennium Copyright Act (DMCA), unfair competition, negligence, and unjust enrichment -- need to be "trimmed" from the lawsuits "so that these cases do not proceed to discovery and beyond with legally infirm theories of liability," OpenAI argued.

OpenAI claimed that the authors "misconceive the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence." According to OpenAI, even if the authors' books were a "tiny part" of ChatGPT's massive dataset, "the use of copyrighted materials by innovators in transformative ways does not violate copyright." Unlike plagiarists who seek to directly profit off distributing copyrighted materials, OpenAI argued that its goal was "to teach its models to derive the rules underlying human language" in order to do things like help people "save time at work," "make daily life easier," or simply entertain themselves by typing prompts into ChatGPT.

The purpose of copyright law, OpenAI argued is "to promote the Progress of Science and useful Arts" by protecting the way authors express ideas, but "not the underlying idea itself, facts embodied within the author's articulated message, or other building blocks of creative," which are arguably the elements of authors' works that would be useful to ChatGPT's training model. Citing a notable copyright case involving Google Books, OpenAI reminded the court that "while an author may register a copyright in her book, the 'statistical information' pertaining to 'word frequencies, syntactic patterns, and thematic markers' in that book are beyond the scope of copyright protection."

OpenAI Disputes Authors' Claims That Every ChatGPT Response is Derivative Work

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 119 Comments Log In/Create an Account

Comments Filter:

Nope (Score:3)

by Aighearach ( 97333 ) writes: on Wednesday August 30, 2023 @04:12PM (#63809746)

misconceive the scope of copyright, failing to take into account the limitations and exceptions (including fair use)
Fair use is a defense, but you have to have otherwise violated the copyright to claim it. It is not a sound legal argument to say that fair use is outside the scope of copyright.
Lawyers are expected to always file a motion to dismiss... that they included this argument shows how weak their case is. It seems pretty obvious that they copied all these authors works, without permission. The bot can recite whole sections...

- Re: (Score:2)
  
  by Xenx ( 2211586 ) writes:
  
  Fair use is a defense, but you have to have otherwise violated the copyright to claim it. It is not a sound legal argument to say that fair use is outside the scope of copyright.
  It is fair to say that if something is an exception to a rule/law, it's outside of the scope of it. That isn't a statement as to the validity of their claim, only one of how they worded it.
  Lawyers are expected to always file a motion to dismiss... that they included this argument shows how weak their case is.
  It does nothing of the sort. You just appear to have an incomplete understanding.
  It seems pretty obvious that they copied all these authors works, without permission.
  
  They go on to show there is legal precedence for being able to use existing works, assuming they're acquired legally, to derive data from the works. "OpenAI reminded the court that "while an author may register a copyright in her book, the 's
- Re: (Score:2)
  
  by Tora ( 65882 ) writes:
  
  Does a person who's good at memorizing books need permission to recite sections of said book from memory?
  - Re: (Score:2)
    
    by Impy the Impiuos Imp ( 442658 ) writes:
    
    According to Big Brother live feed, just a few seconds is enough to get them to block the feed.
    "Stop singing", and reciting other copyrighted stuff gets the slam.
  - Re: (Score:2)
    
    by Xylantiel ( 177496 ) writes:
    
    Of course they do. Otherwise anybody could just pay someone to memorize a book one paragraph at a time and retype it and it wouldn't be copyrighted anymore. The WORK is copyrighted. Even if a human reproduces the work, it is still subject to copyright.
- Re: (Score:2)
  
  by cowdung ( 702933 ) writes:
  
  "It seems pretty obvious that they copied all these authors works, without permission. The bot can recite whole sections..."
  Ok.. I've studied transformers like GPT. Can you explain to me how they "copy" authors' works?
  - Re: (Score:2)
    
    by fluffernutter ( 1411889 ) writes:
    
    A better question is, how can chat gpt come up with anything that is based from direct real world experience and not from someone else's work?
  - Re: (Score:2)
    
    by BranMan ( 29917 ) writes:
    
    I think (and I have NOT studied transformers or GPT) that how that happens is the probabilities collapse to 100%. What I mean by that is like when I put in a super specific search phrase into Google and it comes back with ONE, and only ONE answer. (Happened to me about 6 months ago and was amazing at the time).
    GPT can only match on what it finds - if it's a subject no one anywhere writes about, and there is only one work for GPT to draw from that fits, then it only has one sequence of words to "choose fro
- Re: (Score:2)
  
  by thegarbz ( 1787294 ) writes:
  
  The bot can recite whole sections...
  Your ability to recite a whole section is not a copyright infringement. I can remember exactly how a Muse song sounds and can even sing the lyrics. My ability to do so is not copyright infringement.
  ACTUALLY DOING SO, would be.
  Also I'm suing you because your brain just copied this post into your memory without my permission you hypocrite!
  - Re: (Score:2)
    
    by fluffernutter ( 1411889 ) writes:
    
    If you recreated the song based on your memory then it should be infringement, however.
  - Re: (Score:2)
    
    by Aighearach ( 97333 ) writes:
    
    Your ability to recite a whole section is not a copyright infringement.
    This is the sort of stupid non-argument arguments people make on slashdot.
    No, the ability to recite whole sections is not copyright infringement. Actually doing it is. These bots don't have agency, and they don't have any code that blocks them from reciting those sections. So they do recite them.
    Why reply if you're gonna say something completely obtuse?
- - Re: (Score:2)
    
    by Growlley ( 6732614 ) writes:
    
    you can also argue the amount used was a100% They only needed to use it once and they did.
    - Re: Nope (Score:2)
      
      by beelsebob ( 529313 ) writes:
      
      Itâ(TM)s not about the amount used, itâ(TM)s about the amount copied. ChatGPT does not hold a copy of any of these works.
The recombination that happens in LLM (Score:5, Insightful)

by presidenteloco ( 659168 ) writes: on Wednesday August 30, 2023 @04:16PM (#63809754)

An LLM's output is effectively a novel recombination of micro-patterns from thousands or millions of authors, literary and otherwise.

This is strongly analogous to how a human's pondering and then utterances on a topic, during conversation, are a very complex function of many elements including the combined particular knowledge they've absorbed and abstracted and re-mixed. A good chunk of the knowledge (and fragments of ways of expressing) that the human has absorbed into their associative memory come from the many copyrighted works of literature and audio or video that they have imbibed.

So should humans be constantly accused of copyright violation for expounding on things, based on a huge combination and re-mixing of copyrighted (and uncopyrighted) works/information experiences?

If not, then why should the similarly functioning ChatGPT be accused of such violation?

- Re: The recombination that happens in LLM (Score:2)
  
  by S_Stout ( 2725099 ) writes:
  
  Yes. Your thoughts were derived from two other authors. You are going to jail for a very long time.
- Re: (Score:2)
  
  by EvilSS ( 557649 ) writes:
  
  So should humans be constantly accused of copyright violation for expounding on things, based on a huge combination and re-mixing of copyrighted (and uncopyrighted) works/information experiences?
  Don't give the author's guild any ideas. This is, after all, the same group that sued over kindle's text to speech.
- Re: (Score:2)
  
  by taustin ( 171655 ) writes:
  
  Since the first cave man painted the first picture of a bison on the wall of his cave with black soot from his fire, all art has been informed by other art. That's how it works.
  The question here is, derivative or transformative.
  Lawyer on both sides know this. All else is waving flags at the potential jury pool.
  - Re: (Score:2)
    
    by fluffernutter ( 1411889 ) writes:
    
    Not true. If an artist sits down and paints their own interpretation of a bowl of fruit that is real and that they are looking at, how is that derivitive?
- Re: (Score:2)
  
  by Ichijo ( 607641 ) writes:
  
  So should humans be constantly accused of copyright violation for expounding on things...?
  They often are.
  If not, then why should the similarly functioning ChatGPT be accused of such violation?
  Because ChatGPT can always tell you where they got the idea from, and can also be made to forget something if the original author wishes it. It would be unjust to try to force a human to do the same just because ChatGPT can do it.
  - Re: (Score:2)
    
    by xwin ( 848234 ) writes:
    
    So should humans be constantly accused of copyright violation for expounding on things...?
    They often are.
    This just shows how backward the system is. Copyright is not a natural thing, it is a legal thing so someone can extort money from someone else. It is well and good if it is done for a limited time, but the current system is broken. Copyright should end after say 10 years after registration and certainly should end upon death of the author.
    Regardless of copyright, LLM is not copying anything as the GP stated. It creates new text which is similar to the text it was trained on. If I register copyrights on p
  - Re: (Score:2)
    
    by stabiesoft ( 733417 ) writes:
    
    So curious, you suggest ChatGPT can forget some author's work. Then does ChatGPT store all the data it used for training? Then would it be true that every new copy of the data set would violate copyright since each new training set has an unlicensed copy of the original author's work? I've no idea. But it seems like only one copy of the book could be stored for each training set if the training set contains a copy of all the inputs. Unless of course they buy a copy of all the input data with each copy of th
  - Re: (Score:2)
    
    by presidenteloco ( 659168 ) writes:
    
    "Because ChatGPT can always tell you where they got the idea from" you claimed.
    
    That is incorrect. LLMs like ChatGPT store only a statistical abstraction of all of the sequences of words (from many billions of sources) that have been read into them in the neural-net training.
    - Re: (Score:2)
      
      by Ichijo ( 607641 ) writes:
      
      LLMs like ChatGPT store only a statistical abstraction of all of the sequences of words (from many billions of sources) that have been read into them in the neural-net training.
      
      That is incorrect. ChatGPT can also summarize the content of almost any book written before 2021 [medium.com].
      - Re: (Score:2)
        
        by presidenteloco ( 659168 ) writes:
        
        Summarizing is based on abstraction of the content, which is consistent with what I said.
        ChatGPT may keep around and have ability to access some or all of its source material, I don't know.
        But when it comes up with an answer to a typical prompt from a user, it is not referencing back to all of the original material (trillions of words) that it "read". Instead it is only consulting parts of its trained neural net that have had weights influenced by some of that material; whatever instances and portions of th
- Re: (Score:2)
  
  by Xylantiel ( 177496 ) writes:
  
  The comparison to an automated index or contextual search is a lot more appropriate than to a human. There are decently defined rules for when those do and do not violate copyright. At a minimum I think if one can get the LLM to reproduce a copyrighted work, then the author can receive damages. This means that the company that makes the LLM must also design it to not violate copyright with its output. Otherwise copyright is dead because every copyrighted work will just be imported into an LLM and then e
  - Re: (Score:2)
    
    by WaffleMonster ( 969671 ) writes:
    
    At a minimum I think if one can get the LLM to reproduce a copyrighted work, then the author can receive damages.
    If you ask an LLM for the lyrics to a popular theme song the response provided is a conveyance of fact not a performance.
    It's no different than a person memorizing the same theme song and reciting it when asked.
    This means that the company that makes the LLM must also design it to not violate copyright with its output. Otherwise copyright is dead because every copyrighted work will just be imported into an LLM and then everyone can just buy access to the LLM and use the appropriate prompt, like "please recite the work ... by ...". So every copyrighted work will be sold one time, to the company that makes the dominant LLM. The LLM just becomes a copyright washing machine.
    LLMs don't work this way. They may have better memories than some of us but none of them are that good.
    To give you an idea try asking an LLM something very specific but not widely known. Ask it for example to tell you the callsigns of a random cruise ship. If you dump the context window between pr
    - Re: (Score:2)
      
      by fluffernutter ( 1411889 ) writes:
      
      But that only proves further that it is derivitive, because it can't tell you about anything that it hasn't sampled. What has it sampled that isn't someone else's work?
      - Re: (Score:2)
        
        by WaffleMonster ( 969671 ) writes:
        
        But that only proves further that it is derivitive, because it can't tell you about anything that it hasn't sampled. What has it sampled that isn't someone else's work?
        LLMs are not a search index. What makes the technology useful is that generally applicable principals are being learned during training that are subsequently generally applicable within and across domains in response to prompting. The ability to apply knowledge is what sets AI apart from a search thru a database.
        If you ask it to write you a joke or story a thousand times and dump the context window between each attempt you will get a thousand jokes and stories from the same exact prompt. Perhaps some by
        
        Re: (Score:2)
        
        by fluffernutter ( 1411889 ) writes:
        
        A six sided die gives you a different answer any time. Is a ten or twenty sided die more creative?
        
        In the case of mathematics, it is not finding new mathematical theories. It may be stringing X mathematical methods together but all methods would be someone else's work. But it is not able to conclude something that no one has ever known or written, like exactly how dark matter works. It can only spit out combinations of what has been written before. It can make inferences but that is not in it self cre
- Re: (Score:2)
  
  by evanh ( 627108 ) writes:
  
  If there's money involved, yes, humans are subject to copyright violation for expounding on things.
Laws can be broken for progress (Score:2)

by S_Stout ( 2725099 ) writes:

If AI is primed to do everything for us then yes, we will break copyright to train AI. Why? Because other countries will do it and their AI will be better.

We live in a global economy, we will not stop progress.
All human students (Score:2)

by MpVpRb ( 1423381 ) writes:

...learn by studying the work of others
Using human generated work to train AI is fair use
I would never want to read a book created by AI. Only people can make creative art
Unfortunately, those who control entertainment, hate creative work and prefer sequels, reboots, remakes, spinoffs, etc. Much of what they produce might as well be created by AI
Hopefully, people will get bored and demand original, creative work by human artists
- Re:All human students (Score:5, Insightful)
  
  by ShanghaiBill ( 739463 ) writes: on Wednesday August 30, 2023 @04:34PM (#63809834)
  
  Only people can make creative art
  If that is true, then human judges of creativity should be able to easily distinguish between human art and AI art.
  Guess what? They can't.
  They'll be even less able in the future.
  
  - Re: (Score:2)
    
    by stabiesoft ( 733417 ) writes:
    
    Well in the US, AI cannot patent https://www.nature.com/article... [nature.com] and I believe copyright is also not permitted. So today, legally, only people can make creative works that are protected.
  - Re: (Score:2)
    
    by thegarbz ( 1787294 ) writes:
    
    Guess what? They can't.
    I'm not sure what retarded judges you're talking to, but largely AI generated art is dead easy to spot. In any case the topic has nothing to do with the quality of the output and everything to do with creative expression. There's a reason e.g. midjourney images are easy to spot, they all *look the same*. They lack any kind of creativity, and above all they require creative input in order to generate any useful outcome at all.
    - Re: (Score:2)
      
      by WaffleMonster ( 969671 ) writes:
      
      I'm not sure what retarded judges you're talking to, but largely AI generated art is dead easy to spot.
      In any case the topic has nothing to do with the quality of the output and everything to do with creative expression. There's a reason e.g. midjourney images are easy to spot, they all *look the same*. They lack any kind of creativity, and above all they require creative input in order to generate any useful outcome at all.
      Literally every point made above is completely backwards.
    - Re: (Score:2)
      
      by null etc. ( 524767 ) writes:
      
      Midjourney images "look the same" because of product management decisions made by Midjourney's product engineering and marketing teams to offer a consistent, solid user experience that can reliably deliver art that meets or exceeds the needs and expectations of the product's target audience, at a price they would be willing to pay.
      Try installing Stable Diffusion locally on your computer, and you can tailor the "AI art style" to whatever you see fit. No need to make inaccurate generalizations about AI.
  - Re: (Score:2)
    
    by hebertrich ( 472331 ) writes:
    
    Care to tell us where you get the numbers ?
    Studies ?
Writing is on the wall (Score:2)

by oumuamua ( 6173784 ) writes:

Time to change how society works as AI is coming for all our jobs.
To the tired old "Capitalism promotes innovation and efficient distribution of goods and services'
AGI will do it better!
Will AI be the next bubble? (Score:5, Insightful)

by bubblyceiling ( 7940768 ) writes: on Wednesday August 30, 2023 @04:26PM (#63809804)

AI is sure starting to look like the next bubble. All the claims seem to have fallen flat on their face and now legal troubles. What comes next?

- Re: (Score:2)
  
  by ShanghaiBill ( 739463 ) writes:
  
  All the claims seem to have fallen flat on their face
  If you believe that, you're not paying attention.
  LLMs make mistakes. Sometimes hilarious mistakes. But more often than not, they are correct, and the error rate will fall rapidly with improved training and faster hardware.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  For ChatAI type AI, it has been the "next bubble" for quite some time. There are grande, massively overstated claims, and when you look, rather simplistic demos on the skill level of a beginner. There are massive, massive problems that have been demonstrated, like model poisoning, unfixable damage to models by recursion (https://arxiv.org/pdf/2305.17493v2.pdf), the impossibility to make LLMs safe (https://arxiv.org/pdf/2209.15259.pdf), ChatAI getting dumber in most/all other areas when you try to fix proble
Obviously they would (Score:2)

by gweihir ( 88907 ) writes:

Also obviously, the claim is true and can be mathematically proven. Statistical models _cannot_ be original. That is fundamentally impossible. Only deductive AI models can theoretically be original and they drown in complexity before they get there.
- Re: (Score:2)
  
  by mesterha ( 110796 ) writes:
  
  Also obviously, the claim is true and can be mathematically proven. Statistical models _cannot_ be original. That is fundamentally impossible. Only deductive AI models can theoretically be original and they drown in complexity before they get there.
  
  I could claim the opposite. There is no creativity in deductive models since all the true statements are already determined by the axioms. If you claim that the creativity is finding interesting true statements, given the complexity, then the creativity is in
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    That is stupid. "Creativity" != "creating original information". https://en.wikipedia.org/wiki/... [wikipedia.org]
    Incidentally, you just stated that either creativity is impossible or limited to sentient beings _and_ that sentience is an extra-physical phenomenon. Are you sure you wanted to do that?
    - Re: (Score:2)
      
      by fluffernutter ( 1411889 ) writes:
      
      Creativity requires experiencing the real world.
      - Re: (Score:2)
        
        by WaffleMonster ( 969671 ) writes:
        
        Creativity requires experiencing the real world.
        Would you agree someone who has never been able to move, see, feel, smell but can listen can't be creative because they've never experienced any of the things described to them in the real world?
        Is star trek creative? Afterall nobody has ever been in a starship before or gone to any strange new worlds. Without experiencing a real starship or other planets, without any relevant experience how can sci-fi be creative? If the answer is some form of extrapolation and application of learned experience the foll
        
        Re: (Score:2)
        
        by fluffernutter ( 1411889 ) writes:
        
        Well if they could only listen, then they could draw (assuming they can move) a visualization of what they think the sound that they heard looks like and it would be creative. They could draw anything as described to them and it would be creative.
        
        As for Star Trek, we witness the stars and space. We witness the laws of physics and different species on earth. We witness gravity and by 1965 knew about lack of gravity in space. The first being in orbit was 1957, the first spacewalk was 1961. Sure there w
    - Re: (Score:2)
      
      by mesterha ( 110796 ) writes:
      
      That is stupid. "Creativity" != "creating original information"
      
      So you don't think the LLMs create original information? Deduction sure doesn't do that. In Shannon's information theory, randomness plays an essential role, so if you actually defined things, you might be stuck with probability/statistics.
      Also, most LLMs provably generate original text. The output range of these LLMs is enormous. They can't help but generate original text. Of course, a monkey can easily generate original "text", so th
- Re: (Score:2)
  
  by WaffleMonster ( 969671 ) writes:
  
  Also obviously, the claim is true and can be mathematically proven. Statistical models _cannot_ be original. That is fundamentally impossible. Only deductive AI models can theoretically be original and they drown in complexity before they get there.
  It can write bedtime stories.
  PROMPT: Please write a bedtime story about a grump who shits on everything he doesn't understand. Once upon a time, there was a grumpy old man named Mr. Grumps. He lived in a small town with his cat and dog, but he wasn't very happy. You see, Mr. Grumps didn't like anything new or different. If something didn't fit into his idea of how things should be, he would get angry and start complaining. Mr. Grumps was especially grumpy about technology. He thought it was a
  - Re: (Score:2)
    
    by stabiesoft ( 733417 ) writes:
    
    It was a good little robot. The story ended well for tech. All is good.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Also obviously, the claim is true and can be mathematically proven. Statistical models _cannot_ be original. That is fundamentally impossible. Only deductive AI models can theoretically be original and they drown in complexity before they get there.
    It can write bedtime stories.
    It can do a lot of simplistic things (with low reliability), because simplistic things can be easily and statistically derived from its training data. Most people forget how _much_ training data went into these systems. These systems however cannot go beyond that training data and always fall short of what the training data would have allowed something with real intelligence to do with it. Statistical derivation of things is always incredibly shallow. There is not even one real deduction step in there.
    - Re: (Score:2)
      
      by WaffleMonster ( 969671 ) writes:
      
      It can do a lot of simplistic things (with low reliability), because simplistic things can be easily and statistically derived from its training data.
      Well the machine did manage to create an original bedtime story despite your claim it can't be original.
      Come to think of it previously you admitted to never even having tried GPT-4.
      You previously claimed "ChatGPT cannot even do a simple addition of two arbitrary numbers, the model is simply incapable of doing something like that. " which didn't age well when you were instantly proven wrong.
      Before that you said "The only impressive thing about LLMs is the language interface, not the utterly dumb "reasoning
      - Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Well the machine did manage to create an original bedtime story despite your claim it can't be original.
        Your evaluation is flawed. This story is not original, but deeply derivative. All this shows is your lack of insight.
        
        Re: (Score:2)
        
        by WaffleMonster ( 969671 ) writes:
        
        Your evaluation is flawed. This story is not original, but deeply derivative. All this shows is your lack of insight.
        What if anything is the objective basis for this claim? If it was deeply derivative and not original what was it deeply derived from? What objective criteria do you believe must be met for something to be considered original that was not met?
        Can you for example point to an original bedtime story and contrast that with the machine generated story showing how and why the definition applies to the "original" yet why the machine generated story falls short?
        Is there an objective falsifiable means of discrimina
OpenAI will win this (Score:2)

by SmaryJerry ( 2759091 ) writes:

Things like "Please output page 1 paragraph 2 of the Book ____" just won't work. The entire works are not contained within the existing system.
- Re: (Score:2)
  
  by Njovich ( 553857 ) writes:
  
  Wow, Robin Thicke should have picked you as a lawyer for his copyright case as clearly you know better than his lawyers, the judges, and the experts. Making a full copy of a work is not needed for copyright infringement. The fact is that if it can be shown that the training included a work, and that it can be triggered to output something very similar, that's a lawsuit that can go either way.
- Re: OpenAI will win this (Score:2)
  
  by maird ( 699535 ) writes:
  
  what about asking "Please provide a paragraph of text in the style of __author__ about a child's feelings toward a good father". Repeat for the style of __author2__ and an abusive father. I just tried the following successfully on ChatGPT: Write a four line poem in the style of Bob Dylan's acoustic period about the hard life of a teenage boy trying to find a job after high school? If you repeat the same four more times giving slightly different conditions for the content (finding a wife, raising a child,
  - Re: OpenAI will win this (Score:2)
    
    by maird ( 699535 ) writes:
    
    ...go on strike if you are worried by the idea of repeating the same idea but asking for TV show script lines in the style of named person when that named person is yourself and you work as a TV writer. Currently the overhead is lower to ask the writer for the lines but it will get easier to the point where you can give it text documents with multiple questions and drive the authoring from there without much real invention involved beyond the broadest definition of the scenario involved. Which suggests to m
Borrowed Books (Score:2)

by neoRUR ( 674398 ) writes:

So If I borrow a book from a friend, or read an article in the book store, or download a PDF of a book and read it. Them am I liable for what I read from that if I tell someone else, or summarize that? If I watch a TV show, say on Youtube. Then that is copyrighted material in my brain if I go an use it?
- Re: (Score:2)
  
  by fluffernutter ( 1411889 ) writes:
  
  You aren't liable for a memory, of course not. But if you write a story based on that memory without applying any real world experience of your own then it is a violation.
New laws (Score:2)

by backslashdot ( 95548 ) writes:

We need laws that clarify it is ok for AI to be trained on publicly accessible and/or purchased content. If you make it publicly accessible I should have the right to train my AI on it. When you put information out into the world you can’t expect a cut based on what someone does with that info. If I read a book on aerodynamics and build & sell airplanes I don’t owe the author of that book any money.
Have they published their evidence? (Score:2)

by turp182 ( 1020263 ) writes:

We should be able to recreate anything they found.
Reviews and comments for books were certainly ingested and they can be lengthy and contain considerable plot detail and summarization. So I would expect the LLMs to be aware of these works and be able to summarize them.
I'd like to see how they "know" a certain text was used. The devil is in the details (not the summaries).
I don't have any of these nor read them. I might try looking for Stephen King details, he has also claimed this.
It's derivative work if you are a poor... (Score:2)

by ffkom ( 3519199 ) writes:

... artist or private enthusiast using whatever machinery on copyrighted input. But if the machinery is owned by a large corporation, then of course it's all fine. That's why Google won similar lawsuits before...
All works are derivative. (Score:2)

by Petersko ( 564140 ) writes:

Would be interesting if one of those authors ever tried to copyright an "original" blues song.
Yes... I know. "All works are derivative" is annoying click-bait. The statement is both completely true and entirely useless.
- Re: (Score:2)
  
  by xwin ( 848234 ) writes:
  
  The only non derivative work would be if an author learned the alphabet and memorized 100K words from a dictionary and then wrote a piece of literature or science. Everything else is derivative work.
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
The Life Form Defense (Score:3)

by ErikKnepfler ( 4242189 ) writes: on Wednesday August 30, 2023 @08:00PM (#63810594)

more fun would be to claim that ChatGPT is a life form and then simply argue that it's being taught like a child reading from the same books, just faster

- Re: (Score:2)
  
  by thegarbz ( 1787294 ) writes:
  
  But it would be easy to prove false given that ChatGPT is unable to understand even the most basic concepts. People need to realise that fundamentally repeating something, mixing something, and understanding the context of something is not the same thing.
Error (Score:2)

by ElizabethGreene ( 1185405 ) writes:

I don't see how it can be a derivative work. Derivative works are copyrighted works based on something else. The OpenAI responses were created by an AI which cannot be the author of a copyrighted work so they can't be derivative works.
- Re: (Score:2)
  
  by fluffernutter ( 1411889 ) writes:
  
  What does chatgpt do that isn't based on something else? It's just a calculation of a million things someone else did. It cannot interject anything it has learned from the real world because it cannot sense the real world.
Standing in the shoulders of giants (Score:2)

by Morky ( 577776 ) writes:

All human creativity is derivative to a large extent.
- Re: (Score:3)
  
  by Retired Chemist ( 5039029 ) writes:
  
  There is a difference between being influenced by someone and copying someone. If you copy a large number of people and make a mix, it is still a copy. This has already been litigated when humans were doing the combining (for example, DJs doing sampling). In the example above, a musician, who had no external influences would also have no training and would probably not create anything we would recognize as music. There is a difference between making something in the style of someone and copying their w
  - Re:Sawing the branch you're sitting on (Score:5, Insightful)
    
    by taustin ( 171655 ) writes: on Wednesday August 30, 2023 @04:35PM (#63809842) Homepage Journal
    
    Not really, The legal issue is whether the output of the chatbot is derivative - which requires permission from the original copyright holder to distribute - or transformative - which does not.
    Google's book scanning project face the exact same claims in multiple lawsuits, and won on the basis that a full text searchable index , is a new kind of thing, and thus, transformative.
    This is really the only legal argument that matters. If it's transformative, it doesn't matter who wrote the works used to train, or how they're used in train, only that it's transformative.
    Both sides have coherent arguments in this, but Authors Guild, Inc. v. Google, Inc. is going to be pretty relevant.
    
    - Re: (Score:2)
      
      by tlhIngan ( 30335 ) writes:
      
      Not really, The legal issue is whether the output of the chatbot is derivative - which requires permission from the original copyright holder to distribute - or transformative - which does not.
      Google's book scanning project face the exact same claims in multiple lawsuits, and won on the basis that a full text searchable index , is a new kind of thing, and thus, transformative.
      This is really the only legal argument that matters. If it's transformative, it doesn't matter who wrote the works used to train, or
      - Re: (Score:2)
        
        by taustin ( 171655 ) writes:
        
        I have no idea how this will turn out when it gets to SCOTUS, only that it will, and that the Google case will figure prominently is at least one side's arguments.
        And after that, Congress may well alter the law to suit the desires of whoever owns them that week.
      - Re: (Score:2)
        
        by fluffernutter ( 1411889 ) writes:
        
        Computers sense nothing from the world on their own. It is completely impossible for chatgpt to come up with something that is not based on a calculation from something someone put on the internet. Therefore everything is derivitive.
    - Re: (Score:2)
      
      by msimm ( 580077 ) writes:
      
      All work is derivative.
      - Re: (Score:2)
        
        by fluffernutter ( 1411889 ) writes:
        
        Not true. Van gogh took people/things he experienced in real life and painted them how he saw them. The fact that the transformation game from his life experiences and his manual dexterity as an artist makes them transformative. Chatgpt has no life experiences, it doesn't even think. It is literally impossible for it to come up with anything other than a calculation based on other people's works. Therefore it is always derivitive by definition.
        
        Re: (Score:2)
        
        by taustin ( 171655 ) writes:
        
        Ban Gogh didn't invent the basic techniques, nor technology, of painting pictures. Improved, undoubtedly, but he started with what others had done before, just like all artists.
        
        Re: (Score:2)
        
        by fluffernutter ( 1411889 ) writes:
        
        Every artist has their own technique.of brush strokes, that's precisely what makes his work distinguishable. No he didn't invent paint or paint brushes but he very much had his own way to use them. Clearly you are not a painter or know much about art. Even the texture of the paint is important and different with all painters and developed with experience. The fact that he had to invent the very act of painting in order to be creative is ridiculous.
    - Re: (Score:2)
      
      by myowntrueself ( 607117 ) writes:
      
      Not really, The legal issue is whether the output of the chatbot is derivative - which requires permission from the original copyright holder to distribute - or transformative - which does not.
      Google's book scanning project face the exact same claims in multiple lawsuits, and won on the basis that a full text searchable index , is a new kind of thing, and thus, transformative.
      This is really the only legal argument that matters. If it's transformative, it doesn't matter who wrote the works used to train, or how they're used in train, only that it's transformative.
      Both sides have coherent arguments in this, but Authors Guild, Inc. v. Google, Inc. is going to be pretty relevant.
      Would a raw statistical analysis of the text be derivative?
  - Re: (Score:2)
    
    by xwin ( 848234 ) writes:
    
    You need to go and study how LLMs work. Whatever LLM produces is not a copy of something, it is a set of words stringed together based on probability with which they occur in the material used for training LLM. So in effect the output is amalgamation of all the training data and not a copy of any single work.
    If you say that LLM infringing copyrights of someone, then every math student is infringing copyrights of the authors of math textbook , used to teach said student.
    The current copyright system goes ag
  - Re: (Score:2)
    
    by myowntrueself ( 607117 ) writes:
    
    There is a difference between being influenced by someone and copying someone. If you copy a large number of people and make a mix, it is still a copy. This has already been litigated when humans were doing the combining (for example, DJs doing sampling).
    In the example above, a musician, who had no external influences would also have no training and would probably not create anything we would recognize as music. There is a difference between making something in the style of someone and copying their work.
    What a LLM produces when you give it a prompt isn't a *reproduction* of *samples* of the works used to train it.
    If I say to ChatGPT: "Review Star Trek, The Next Generation in the style of Gordon Ramsay", the result is neither a derivation of the works of Gene Roddenberry nor of Gordon Ramsay. It isn't a sampling of them, it doesn't even contain samplings. Its based on statistics, no different to the predictive corrections on your phone keyboard.
    Oh, bloody hell! What on Earth did they think they were doing with Star Trek: The Next Generation? It's like they took a classic recipe and decided to jazz it up with all sorts of strange ingredients. Let's break it down, shall we?
    First off, the characters. You've got Captain Jean-Luc Picard, who's meant to be the leader of the pack, but half the time he's dishing out speeches that are drier than an overcooked turkey. And what's with that android, Data? It's like they grabbed a bucket of emotionless nonsense and poured it all over him. No seasoning, no depth, just blandness.
    And the plots, my goodness! They're all over the place, like a dog's breakfast. One episode, they're dealing with time travel, the next they're stuck in a holodeck fantasy. It's like they couldn't decide if they were making a space opera or a soap opera. Pick a lane, for heaven's sake!
    Now, let's talk about the special effects. Some of those space battles look like they were cooked up by a toddler with a box of crayons. I've seen better effects on a flipping cereal box. And don't get me started on those aliens. Half the time, they look like they raided a Halloween store for their costumes.
    But you know what really takes the cake? The inconsistency. One episode, they're churning out Michelin-star quality storytelling, and the next, they're serving up a half-baked mess that leaves you scratching your head. It's like they had a team of chefs with wildly different skills and couldn't figure out how to make a cohesive menu.
    Look, I'm all for trying new things, pushing boundaries, and experimenting in the kitchen. But when you mess with a classic recipe like Star Trek, you better make sure you're adding the right ingredients in the right proportions. Unfortunately, The Next Generation feels like a recipe that's gone way off course. It's a bit like finding a soggy bottom on your soufflé – disappointing and just not up to par.
- Re: (Score:2)
  
  by taustin ( 171655 ) writes:
  
  If this goes through and accepted as an argument, copyright as a concept is dead.
  Funny, it's been a decade since Authors Guild, Inc. v. Google, Inc. [wikipedia.org] was decided over Google scanning millions of books, using the same defense, transformative vs derivative, and, despite the same predictions of hellfire raining down from the heavens, dogs and cats living together, and, yes, the death of copyright as a concept, despite all that, Google won at trial, and the publishing industry and copyright seem to be doing just fine.
  - Re: (Score:3)
    
    by Luckyo ( 1726890 ) writes:
    
    You're missing the point entirely. That was irrelevant to this, because there the argument was "is this novel enough?"
    Here, the question is much more fundamental: "are you allowed to learn from material without paying the copyright holder?" Because AI learns from material. It doesn't actually copy it. And that's the argument, that the output of the learning process falls under the copyright, because you shouldn't be allowed to learn from material and then generate your own creative output based on that lear
    - Re: (Score:2)
      
      by taustin ( 171655 ) writes:
      
      You're missing the point entirely.
      That is the only point.
      That was irrelevant to this, because there the argument was "is this novel enough?"
      Here, the question is much more fundamental: "are you allowed to learn from material without paying the copyright holder?"
      What part of Title 17 prohibits it? And, by necessity, prohibits library patrons from learning from material they haven't paid the copyright holder for, or students who haven't paid the copyright holders for the textbooks they learn from?
      (And there have been lawsuits over libraries loaning out books. That was settled over a century ago.)
      Somebody paid for the material used in training, at some point. After that, the right of first sale comes into play.
      The only legal issue is whether Ch
      - Re: (Score:2)
        
        by Luckyo ( 1726890 ) writes:
        
        I would like to remind you that ban on abortion came from "privacy" protections. Your laser focus on "what's in the text" misses the fact that judges are the ones who get to interpret the text, and they can choose to interpret it in very creative and clearly incorrect ways.
    - Re: (Score:2)
      
      by AmiMoJo ( 196126 ) writes:
      
      AI can be induced to reproduce portions of texts it has read. People have used that to reproduce copyrighted works, and even the basic orders given to the AI by its developers.
      The copyright holders argue that if someone makes an AI they are responsible for its output, and if it copies big sections of their work then that's copyright infringement. The same as if a human reproduced copyrighted work from memory and sold it as a service, they would be responsible.
      The other issue is if AI learning is the same as
      - Re: (Score:2)
        
        by Luckyo ( 1726890 ) writes:
        
        >AI can be induced to reproduce portions of texts it has read. People have used that to reproduce copyrighted works...
        Wacom tablet can be induced to reproduce portions of texts it has read. People have used that to reproduce copyrighted works...
        Paint brush can be induced to reproduce portions of texts it has read. People have used that to reproduce copyrighted works...
        [Musical instrument] can be induced to reproduce portions of texts it has read. People have used that to reproduce copyrighted works...
        You
        
        Re: (Score:2)
        
        by AmiMoJo ( 196126 ) writes:
        
        The difference is a paintbrush is not being sold as a tool to paint your artwork for you.
        The legal argument is that AI is a service provided by the developer, not a tool. The company offers a writing service, just done by AI instead of by human authors. Therefore the same copyright rules that limit how human authors can use the source material applies, to the company providing the service.
        
        Re: (Score:2)
        
        by Luckyo ( 1726890 ) writes:
        
        But Wacom tablets are. And so are electronic keyboards. And logic of a paintbrush is the same. It paints for you, instead of having to do it with your fingers.
        The legal argument is that AI is just another tool that requires human input and generates output based on said input. You're far too used to strawmanning arguments, and it shows even here.
      - Re: (Score:2)
        
        by WaffleMonster ( 969671 ) writes:
        
        AI can be induced to reproduce portions of texts it has read. People have used that to reproduce copyrighted works, and even the basic orders given to the AI by its developers.
        Instructions are a different thing they typically come from context window rather than the model itself. Often you can simply ask the AI to repeat what was said earlier from a brand new session and it'll reveal what was injected by the vendor into the context. I hear some of the vendor instructions to models are fairly insulting and derogatory towards humans.
- Re: (Score:2, Troll)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re: (Score:2)
    
    by Luckyo ( 1726890 ) writes:
    
    As I argue above here: https://slashdot.org/comments.... [slashdot.org]
    It is actually worse than that. I'm not sure if copyright holders suing understand it or not, but they're de jure targeting "ability to learn from copyrighted content" and attempting to establish a precedent that you must have a permission from copyright holder to learn from any copyrighted material and produce any output based on it.
    This would effectively shatter the concept of copyright as it exists, and reverse the intended purpose of copyright when
- Re: (Score:2)
  
  by martin-boundary ( 547041 ) writes:
  
  ChatGPT is not an artist. It's a thing. It's not even a living thing. It has no rights, and whether its "work" derives from those ("who?") that came before is irrelevant.
  The only real question is whether ChatGPT's owners have broken the law by illegally copying works without permission (namely, by downloading the works onto some storage devices, temporarily or not).
  - Re: (Score:2)
    
    by Luckyo ( 1726890 ) writes:
    
    Wacom tablet is not an artist. It's a thing. It's not even a living thing. It has no rights, and whether its "work" derives from those ("who?") that came before is irrelevant.
    Painter's brush is not an artist. It's a thing. It's not even a living thing. It has no rights, and whether its "work" derives from those ("who?") that came before is irrelevant.
    [musical instrument] is not an artist. It's a thing. It's not even a living thing. It has no rights, and whether its "work" derives from those ("who?") that ca
    - Re: (Score:2)
      
      by martin-boundary ( 547041 ) writes:
      
      Your analogies are wasted I'm afraid. Nobody cares if these things are tools, certainly not the law. As always it's human beings breaking the law or not, and the tools are a means to such an end. If humans break the law by doing something, using tools, then they broke the law and are responsible. If humans didn't even realize they were breaking the law because this was implicit in the tool use (like training an AI model), then they still broke the law and are responsible. Also know as "ignorance of the law
- - Re: (Score:2)
    
    by Luckyo ( 1726890 ) writes:
    
    That's not how legal systems based on precedent work. Not even a little bit.
- Re: (Score:2)
  
  by fluffernutter ( 1411889 ) writes:
  
  You have real world experience. It is not correct to say that person's work is the basis from all your knowledge, because there will always also be knowledge of the real world.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Nope (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: Nope (Score:2)

The recombination that happens in LLM (Score:5, Insightful)

Re: The recombination that happens in LLM (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Laws can be broken for progress (Score:2)

All human students (Score:2)

Re:All human students (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Writing is on the wall (Score:2)

Will AI be the next bubble? (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Obviously they would (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

OpenAI will win this (Score:2)

Re: (Score:2)

Re: OpenAI will win this (Score:2)

Re: OpenAI will win this (Score:2)

Borrowed Books (Score:2)

Re: (Score:2)

New laws (Score:2)

Have they published their evidence? (Score:2)

It's derivative work if you are a poor... (Score:2)

All works are derivative. (Score:2)

Re: (Score:2)

Re: (Score:2)

The Life Form Defense (Score:3)

Re: (Score:2)

Error (Score:2)

Re: (Score:2)

Standing in the shoulders of giants (Score:2)

Re: (Score:3)

Re:Sawing the branch you're sitting on (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)