US Copyright Office to AI Companies: Fair Use Isn't 'Commercial Use of Vast Troves of Copyrighted Works' (yahoo.com) 214

Posted by EditorDavid on Monday May 12, 2025 @03:34AM from the breaking-training dept.

Business Insider tells the story in three bullet points:

- Big Tech companies depend on content made by others to train their AI models.

- Some of those creators say using their work to train AI is copyright infringement.

- The U.S. Copyright Office just published a report that indicates it may agree.

The office released on Friday its latest in a series of reports exploring copyright laws and artificial intelligence. The report addresses whether the copyrighted content AI companies use to train their AI models qualifies under the fair use doctrine. AI companies are probably not going to like what they read...

AI execs argue they haven't violated copyright laws because the training falls under fair use. According to the U.S. Copyright Office's new report, however, it's not that simple. "Although it is not possible to prejudge the result in any particular case, precedent supports the following general observations," the office said. "Various uses of copyrighted works in AI training are likely to be transformative. The extent to which they are fair, however, will depend on what works were used, from what source, for what purpose, and with what controls on the outputs — all of which can affect the market."

The office made a distinction between AI models for research and commercial AI models. "When a model is deployed for purposes such as analysis or research — the types of uses that are critical to international competitiveness — the outputs are unlikely to substitute for expressive works used in training," the office said. "But making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries."
The report says outputs "substantially similar to copyrighted works in the dataset" are less likely to be considered transformative than when the purpose "is to deploy it for research, or in a closed system that constrains it to a non-substitutive task."

Business Insider adds that "A day after the office released the report, President Donald Trump fired its director, Shira Perlmutter, a spokesperson told Business Insider."

US Copyright Office to AI Companies: Fair Use Isn't 'Commercial Use of Vast Troves of Copyrighted Works'

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 214 Comments Log In/Create an Account

Comments Filter:

I cannot see this stopping the AI spiders (Score:4, Insightful)

by Alain Williams ( 2972 ) writes: <addw@phcomp.co.uk> on Monday May 12, 2025 @03:44AM (#65369781) Homepage

Effective enforcement will not be easy.
The other problem that the spiders do is to overload servers. They do not seem to be gentle in the way that most search engine spiders are.

- Re:I cannot see this stopping the AI spiders (Score:5, Interesting)
  
  by buck-yar ( 164658 ) writes: on Monday May 12, 2025 @05:28AM (#65369907)
  
  Sure it is. Encourage whistleblowers to come forward. "I did this" from a software engineer in court. There might even be hard evidence in communications, as some of these employees knew what they were doing was wrong and raised objections.
  Penalties, per the FBI warning on home VHS tapes, 5 years, $250,000 fine, and felony so you lose your voting and gun rights for life. If its per violation, some AI execs might be looking at hundreds of years in prison and fines exceeding the value of these large tech companies market capital.
  
  - Re:I cannot see this stopping the AI spiders (Score:5, Insightful)
    
    by gweihir ( 88907 ) writes: on Monday May 12, 2025 @06:31AM (#65369951)
    
    Indeed. Also, under some conditions, LLMs can be made to regurgitate parts of their training data....
    The second thing is, you must delete all the stolen content. And that means the whole LLM in this case.
    
    - - Re:I cannot see this stopping the AI spiders (Score:5, Informative)
        
        by gweihir ( 88907 ) writes: on Monday May 12, 2025 @10:04AM (#65370469)
        
        So? Remember where their markets are. Incidentally, you do not understand how copyright works. How pathetic.
        
        
        Re: (Score:2)
        
        by jonsmirl ( 114798 ) writes:
        
        The AI industry is a 1000lb gorilla and the copyright crowd is a mouse. There is no doubt in how this is going to end up.
        
        Re: (Score:3)
        
        by postbigbang ( 761081 ) writes:
        
        This is kleptocracy incarnate, the thought that what is mine is yours, no compensation, no barter, the robbery and theft of value.
        The size of the gorilla doesn't matter. Remember what a mouse does to an elephant.
        
        Re: (Score:2)
        
        by larryjoe ( 135075 ) writes:
        
        This is kleptocracy incarnate, the thought that what is mine is yours, no compensation, no barter, the robbery and theft of value.
        The size of the gorilla doesn't matter. Remember what a mouse does to an elephant.
        It's interesting that AI spidering and movie/music downloads are viewed differently in terms of IP protection. Especially before the advent of "affordable" music streaming, music file sharing was rampant, and many young/tech people seemed to view that as okay and just a reaction to the music robber barons. Movie/TV piracy is still ongoing but becoming less a thing due to movie streaming becoming more affordable (relative to a decade ago).
        It'll be interesting to see how AI spidering and data gathering evol
        
        Re: (Score:2)
        
        by postbigbang ( 761081 ) writes:
        
        You say that, but look at the chicanery of how Spotify robs musicians, and cribs their infrastructure to further dilute royalties with noise.
        Two can play that game, and it's already happening as data pools are poisoned with AI blather to fight, that's right, AI crawler scraping.
        
        Re: (Score:2)
        
        by Alain Williams ( 2972 ) writes:
        
        AI companies have the cash and political connections to get what they want. The push back has already started:
        The head of the US Copyright Office has reportedly been fired, the day after agency concluded that builders of AI models use of copyrighted material went beyond existing doctrines of fair use. [theregister.com]
  - Re: (Score:2)
    
    by Pinky's Brain ( 1158667 ) writes:
    
    Suchir tried that.
  - Re: (Score:2)
    
    by strikethree ( 811449 ) writes:
    
    It takes one million dollars to escape those charges. Do you really think the AI companies will be fearful of any "laws"?
- Re:I cannot see this stopping the AI spiders (Score:5, Interesting)
  
  by AmiMoJo ( 196126 ) writes: on Monday May 12, 2025 @07:25AM (#65370029) Homepage Journal
  
  Effective enforcement is easy. Just make some severe penalties for doing it, and crucially for using AI that has been trained on unlicensed material. Then any business that wants to sell AI services will need to certify that they trained it legally, because their customers will demand it for fear of being hit by penalties themselves.
  The same rules will apply to foreign made AIs of course.
  The EU does that and it has proven successful with things like GDPR.
  
- Re: I cannot see this stopping the AI spiders (Score:2)
  
  by simlox ( 6576120 ) writes:
  
  And it certainly won't stop Chinese spiders... So even though it seems right and a win for content providers to stop the spiders, it will just give China a huge advantage.
  - Re: I cannot see this stopping the AI spiders (Score:2)
    
    by EldoranDark ( 10182303 ) writes:
    
    This would not stop western companies from using copyrighted material for research, so development can carry on. At the same time we can require that any Chinese models used in the west adhere to the same standards or gtfo.
- Re:I cannot see this stopping the AI spiders (Score:5, Interesting)
  
  by mysidia ( 191772 ) writes: on Monday May 12, 2025 @11:06AM (#65370679)
  
  The other problem that the spiders do is to overload servers. They do not seem to be gentle
  There is also a technical issue that calls for a technical solution IMO.
  Most web servers are based off open source programs such as Apache or Nginx. My suggestion is that those projects should develop some mitigation against automated crawlers and spiders, and include a Default configuration that spanks them hard while still allowing well-known crawlers within reasonable limits.
  Essentially peoples' web service daemon should have functionality added to identify and classify crawlers and Penalize or Block those crawlers or IP addresses who are identified as having acted in certain shady manners, including:
  1. Crawlers who performed an excessive number of requests per second or per minute.
  2. Crawlers who fail to maintain a stable distinctive User-Agent string. Especially any crawlers spoofing a standard Browser UA string. Also Crawler IPs known to have taken on a different major provider's UA string such as a non-Google robot using a Googlebot UA.
  3. Crawlers who failed to request a robots.txt Or who disregarded a robots.txt by crawling a directory not listed as allowed in it (if present) or by crawling a directory listed as Disallowed.
  I'm suggesting that standard web server software get modules added to Detect these cases AND share IP address and User-Agent information with centralized trackers of crawlers that can be locally classified as violative.
  Finally, that they add functionality where certain DNSBL and shared Classification repositories can be used to automatically block various IP Addresses and UA strings detected by others as being nefarious or unruly crawlers.
  
Repeat after me (Score:5, Insightful)

by blahabl ( 7651114 ) writes: on Monday May 12, 2025 @03:45AM (#65369783)

Copyright is not a natural right. It is a privilege given to content creators for the benefit of mankind not for the benefit of content creators. And to anyone saying "copyright does not allow use by AI" an answer of "well, maybe it should" is very valid.

- Re: Repeat after me (Score:5, Insightful)
  
  by reanjr ( 588767 ) writes: on Monday May 12, 2025 @05:15AM (#65369893) Homepage
  
  It may be a "valid" response, but it's hardly compelling. Your argument is that copyright should be completely upended and author protections should simply vanish into an LLM. Yo make that argument, you're going to have to start at first principles to explain why compensating artists is no longer beneficial to mankind.
  
  - Re: Repeat after me (Score:3)
    
    by St.Creed ( 853824 ) writes:
    
    A moderate amount of compensation and protection wouldn't be bad. However, the current "annuity for my great-great-grandchildren" is anything but. It is the product of lobbying by the biggest corporations, does nothing for authors and artists, and only ensures rentseeking is profitable while stifling innovation and new arts. Sampling in hip hop being a case in point, but there are hundreds of examples if you search a bit.
    Rent seeking is a problem because it eventually paralizes any economy. And it is expeci
  - Re: (Score:3)
    
    by SteelCamel ( 7612342 ) writes:
    
    It's not "completely upended". It's currently perfectly legal for a human to read all the copyrighted works they want, learn from them, and produce their own works based on what they've read. It's not immediately obvious that a machine doing the same thing is contrary to the principles of copyright. Of course in either case if the output is too closely based on a small set of works then it's an infringement, but that doesn't mean that the training itself is.
    That's not so say that there isn't an issue of cou
    - Re: (Score:2)
      
      by omnichad ( 1198475 ) writes:
      
      in either case if the output is too closely based on a small set of works then it's an infringement,
      These bots are getting better at guardrails for speech and not saying things that are factually false. Like if you ask a newer LLM how many Rs in strawberry, it will produce a convincing but wrong answer but then correct itself mid-output.
      Like a human, there should be "self-awareness" when there are violations and avoidance should be programmed in. But also like a human, a human employee that does the work for hire and unintentionally copies something - the hiring company is allowed to be sued rather than
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    Did "author protections" get "upended" when you read a book? How is it different when it's an LLM doing the reading?
    Your response is neither valid nor compelling.
    "..you're going to have to start at first principles to explain why compensating artists is no longer beneficial to mankind."
    No, all you need to do is point out the hypocrisy of your position. You believe the benefit is for you, not for them. It's pulling up the ladder, nothing more.
    - Re: (Score:2)
      
      by jakimfett ( 2629943 ) writes:
      
      An LLM is different than an individual because said individual cannot read entire libraries in minutes, and cannot programmatically regurgitate those works en-masse.
      
      Starting from first principles here is good, it shows how a well-read author is distinct from wholesale monetized copyright infringement.
      - Re: (Score:2)
        
        by VaccinesCauseAdults ( 7114361 ) writes:
        
        Great examples. Here’s another: the individual doesn’t have hundreds of millions of users issuing billions of queries like “write me a short story in the style of Author X because I want to read one but don’t want to buy a copy.”
        Or “write me a summary of Topic T using all the info you learned from Published Books X Y and Z, because I don’t want to buy a copy.”
        
        Re: (Score:2)
        
        by greytree ( 7124971 ) writes:
        
        Or “write me a thesis of the current thought around Theory T by summarizing all the info you learned from Published Theses X, Y and Z, because I don’t want to buy a copy.”
        
        Which is how science works.
  - Re: (Score:2)
    
    by strikethree ( 811449 ) writes:
    
    I had a thought while reading what you wrote:
    you're going to have to start at first principles to explain why compensating artists is no longer beneficial to mankind.
    The thought was this: Why do artists think they deserve compensation? If someone had asked for the art, then the person who asked should have paid or why would the artist do such a thing and expect money?
    I only responded because you said, "let's go back to First Principles"
    - Re: (Score:2)
      
      by greytree ( 7124971 ) writes:
      
      "Copy" is the word you need to include in your thoughts
- Re: Repeat after me (Score:5, Insightful)
  
  by PseudoThink ( 576121 ) writes: on Monday May 12, 2025 @05:20AM (#65369901)
  
  Agreed. It is a valid discussion and reducing it to a black and white generalization is absurd.
  A complete win for Western content creators would likely leave AI development and advancement crippled compared to countries where it is unfettered. Our content creators can sip their kombuchas while foreign AI dominates the future.
  A complete win for AI companies would likely result in continued, flagrant abuse of created content for profit in a manner which competes with the content creators. Doesn't seem right, either.
  
  - Re: (Score:2)
    
    by tlhIngan ( 30335 ) writes:
    
    Agreed. It is a valid discussion and reducing it to a black and white generalization is absurd.
    A complete win for Western content creators would likely leave AI development and advancement crippled compared to countries where it is unfettered. Our content creators can sip their kombuchas while foreign AI dominates the future.
    A complete win for AI companies would likely result in continued, flagrant abuse of created content for profit in a manner which competes with the content creators. Doesn't seem right,
    - Re: (Score:2)
      
      by omnichad ( 1198475 ) writes:
      
      So instead of life+70, the copyright term can be reduced to something reasonable
      
      I think that I would accept life+70 or 30 years from the point that a work is commercially successful, whichever is smaller. It's complex, but reasonable. However, with strengthened trademark laws regarding use of public domain works.
      This would prevent someone else from profiting off your work before you can. If you suddenly find fame and success after decades of hard work, it wouldn't be right before the clock runs out.
      So Disney would still be the primary Mickey owner even if someone else can give away o
- Re:Repeat after me (Score:4, Interesting)
  
  by Entrope ( 68843 ) writes: on Monday May 12, 2025 @06:13AM (#65369931) Homepage
  
  That's not an argument, that's just emoting.
  How is that different in substance from "maybe copyright infringement should be allowed because I don't want to pay for a newspaper subscription. that would benefit mankind, i.e. me."?
  
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    It's different because the argument is that you get access to the articles for free, but the AI company doesn't. Straw manning the argument is not a win, data being scraped for training isn't behind a "newspaper subscription", and if it were, AI use of the articles should be fine, by your standard, if the company paid the subscription.
    - Re: (Score:2)
      
      by Entrope ( 68843 ) writes:
      
      Copyright laws and licenses govern more than just what somebody pays for the copyrighted work. Your argument is based on ignoring violations of licenses, all the secondary copying that goes on during training, and the propensity of AI models to regurgitate training material.
- Re:Repeat after me (Score:4, Insightful)
  
  by gweihir ( 88907 ) writes: on Monday May 12, 2025 @06:28AM (#65369947)
  
  Well, maybe it should if the resulting models are available and also under fair-use. Most are not, hence criminal commercial copyright infringement.
  
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    You think "fair use" is a license? LOL
    "Most are not" LOLOL says who? You think AI publishers get to decide whether you get "fair use".
    You never fail to impress with the stupid.
    - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      How pathetic. You do not even have basic reading comprehension. Dumb and aggressive. Nice!
- Re: (Score:3)
  
  by msauve ( 701917 ) writes:
  
  I see your argument. Ownership of any sort of property is not a natural right. So someone with a bigger gun taking your stuff is very valid.
  - Re: (Score:2)
    
    by znrt ( 2424692 ) writes:
    
    there are no natural rights, someone with a gun can make a very valid point about your right to life too.
- Copyright is being abused (Score:3)
  
  by Raisey-raison ( 850922 ) writes:
  
  Copyright is not a natural right. It is a privilege given to content creators for the benefit of mankind not for the benefit of content creators. And to anyone saying "copyright does not allow use by AI" an answer of "well, maybe it should" is very valid.
  Copyright is being abused and it's no longer about encouraging innovation. If concentrates wealth and is a key driver of inequality. It has been extended and extended. It went from a reasonable 7 years to 14 years renewable to 28, then life of the author plus 50 to life plus 70 years.
  It was extended to buildings which is completely ridiculous. They industry tried but failed to extend it to clothing. There is absolutely no reason why an AI shouldn't read a work because it doesn't compete by selling a copy of
- Re: (Score:3)
  
  by IDemand2HaveSumBooze ( 9493913 ) writes:
  
  Copyright is not a natural right. It is a privilege given to content creators for the benefit of mankind not for the benefit of content creators
  Absolutely agree. Copyright's purpose is to give creative people with good ideas an incentive to realise or at least flesh out those ideas. It's purpose is not to give people money for nothing - but of course people will never stop trying to find something that gives them exactly that.
  What I very much dispute, however, is that so-called AI is a benefit to mankind. As I see it's it's not only not beneficial but very dangerous. It has potential to bring great harm to mankind
  Let's start by assuming that just t
- Re: (Score:2)
  
  by ihadafivedigituid ( 8391795 ) writes:
  
  I'm out of mod points, so I will just have to applaud this and link, once again, Thomas Babington Macaulay's 1841 speech to the House of Commons on this subject:
  
  https://www.thepublicdomain.org/2014/07/24/macaulay-on-copyright/ [thepublicdomain.org]
- - Re: Repeat after me (Score:2)
    
    by Kelxin ( 3417093 ) writes:
    
    Agreed. The entire moderation system on this platform is screwed.
    - Re: Repeat after me (Score:2)
      
      by drinkypoo ( 153816 ) writes:
      
      This place is generally screwed. It doesn't work, it isn't well managed, you can't block posters, it allows anonymous posts and you can't block those either. Broken by design.
      - Re: Repeat after me (Score:5, Informative)
        
        by Zak3056 ( 69287 ) writes: on Monday May 12, 2025 @08:37AM (#65370211) Journal
        
        you can't block posters, it allows anonymous posts and you can't block those either.
        I mean, you can't "block" anything, but you effectively can. Change your comment modifiers [slashdot.org] and set ACs to -6, set "foes" to -6, mark commenters you don't like as foes, and browse at 0 or above. You will no longer see AC comments or comments from people you don't like.
        
        
        Re: (Score:2)
        
        by drinkypoo ( 153816 ) writes:
        
        I mean, you can't "block" anything, but you effectively can.
        You can't but you can? Take two steps forward, and two steps back, and then two steps forward, and two steps back, and now we're doing the cha-cha.
        I want their content to disappear as if it never was, whether it was modded up or not. That would be blocking, unlike what you propose.
        
        Re: (Score:3)
        
        by Zak3056 ( 69287 ) writes:
        
        Take two steps forward, and two steps back, and then two steps forward, and two steps back, and now we're doing the cha-cha.
        Thank you, Chris Knight.
        I want their content to disappear as if it never was, whether it was modded up or not. That would be blocking, unlike what you propose.
        Yes, I understand--that is not something you can do. You can, however, effectively achieve this by following the instructions given.
        You know, the adverb form of "effect." Specifically, definition 2, "in effect; virtually" which rests upon definition 4 of the root word, "the power to bring about a result." So, to answer your snarky question, yes, in effect you can't but you can.
        Is your brain not firing on all cylinders this morning, or are you just purposefully being obtuse?
        
        Re: Repeat after me (Score:2)
        
        by drinkypoo ( 153816 ) writes:
        
        "Yes, I understand--that is not something you can do. "
        Great, so stop giving me inadequate suggestions as if I don't already know how this site "works". Your condescension is unwarranted.
        
        Re: (Score:2)
        
        by Zak3056 ( 69287 ) writes:
        
        Ah, purposefully being obtuse it is. And an asshole, too. Yay, you win the internet today!
        
        Re: (Score:2)
        
        by pjt33 ( 739471 ) writes:
        
        Take two steps forward, and two steps back, and then two steps forward, and two steps back, and now we're doing the cha-cha.
        I don't know what you're doing, but it's definitely not the cha-cha. That's step, replace, chassé, except for a few figures which chain chassés.
  - Re:Repeat after me (Score:5, Insightful)
    
    by msauve ( 701917 ) writes: on Monday May 12, 2025 @07:32AM (#65370039)
    
    >A well-made factual point.
    
    It is not factual. In the US, copyright is meant to benefit both the author and mankind: "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries."
    
    It (deliberately?) ignored that in order to encourage authorship and still benefit society, copyrights are granted for limited terms, after which the works become public domain. Let the AIs freely train on all the pre-1930 works they want. Whether the 95 year term is appropriate is another matter.
    
    - Re: (Score:3)
      
      by greytree ( 7124971 ) writes:
      
      You misread it.
      
      It parses as "To promote A, by securing (for limited times) to B thing C"
      where
      A = The Progress of Science and useful Arts
      B = Authors and Inventors
      C = the exclusive Right to their respective Writings and Discoveries
      
      i.e. The law is to promote A.
      It does that by securing C for B.
      
      i.e. The law is to promote Progress of Science and useful Arts.
      It does that by securing the exclusive Right to their respective Writings and Discoveries for Authors and Invento
      - Re: (Score:2)
        
        by msauve ( 701917 ) writes:
        
        LOL. You're arguing that granting limited exclusive rights is "not for the benefit of content creators." You fail.
        
        Re: (Score:2)
        
        by greytree ( 7124971 ) writes:
        
        No, I am pointing out that that is not the purpose of the law. You fail.
        
        <sigh>
        
        Re: (Score:2)
        
        by msauve ( 701917 ) writes:
        
        You're a Sophist, I see.
        
        Re: (Score:2)
        
        by omnichad ( 1198475 ) writes:
        
        Without the law, copying everything is legal. Therefore there is no incentive to create because you won't make enough money to fund its creation. The purpose of the law is the trade exclusivity now for explicit public domain rights later. You promote A by making it financially viable.
        
        Re: (Score:2)
        
        by greytree ( 7124971 ) writes:
        
        Yes.
        Which is why the OP wrote "It is a privilege given to content creators for the benefit of mankind not for the benefit of content creators."
        
        And morons "correcting" him get more mod points than him because there are more fucking stupid mods than mods who can actually comprehend basic fucking English.
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    If objective fairness were the policy you would have been gone long ago.
    - Re: (Score:2)
      
      by greytree ( 7124971 ) writes:
      
      I have no idea who the fuck you are.
      
      IOW Fuck off.
- - Re: Repeat after me (Score:2)
    
    by St.Creed ( 853824 ) writes:
    
    That statement isn't supported by the history of copyright. It would have been easier to not have copyright, if you wanted to censor things.
    - Re: (Score:2)
      
      by znrt ( 2424692 ) writes:
      
      The origin of copyright law in most European countries lies in efforts by the church and governments to regulate and control the output of printers.[10] Before the invention of the printing press, a writing, once created, could only be physically multiplied by the highly laborious and error-prone process of manual copying by scribes. An elaborate system of censorship and control over scribes did not exist, as scribes were scattered and worked on single manuscripts.[11] Printing allowed for multiple exact copies of a work, leading to a more rapid and widespread circulation of ideas and information (see print culture).[10] In 1559 the Index Expurgatorius, or List of Prohibited Books, was issued for the first time.[11]
      https://en.wikipedia.org/wiki/... [wikipedia.org]
      (and btw, case in point, another factually correct statement downvoted by zealots)
- - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    But they don't need to, copyright is about making copies of the cake, not eating shit.
    - Re: (Score:2)
      
      by omnichad ( 1198475 ) writes:
      
      not eating shit.
      If it tastes almost as good as the original cake, but at a lower price, the cake stops being produced.
People to US Copyright office (Score:5, Insightful)

by greytree ( 7124971 ) writes: on Monday May 12, 2025 @03:51AM (#65369793)

People of the world to the US Copyright office:

95 years is a fucking abomination.

Copyright is not fit for its purpose of encouraging creativity.

- Re:People to US Copyright office (Score:5, Insightful)
  
  by Errol backfiring ( 1280012 ) writes: on Monday May 12, 2025 @03:53AM (#65369799) Journal
  
  The US Copyright office might even agree. They did not make the rules.
  
  - Re: (Score:2)
    
    by greytree ( 7124971 ) writes:
    
    Jawol. Zey are only following orders.
- Re: (Score:2)
  
  by Zontar_Thing_From_Ve ( 949321 ) writes:
  
  People of the world to the US Copyright office: 95 years is a fucking abomination. Copyright is not fit for its purpose of encouraging creativity.
  The alternative was apparently "forever minus one day" as proposed by Jack Valenti many years ago. So it's better than that. And at least it's finally settled. But yeah, 95 years is too long. Actually, in a few cases it will end up being over 100 years due to the way the law was written, but that only applies to some music in the 1940s if I remember correctly.
Intersting take (Score:5, Interesting)

by thegarbz ( 1787294 ) writes: on Monday May 12, 2025 @04:07AM (#65369821)

Copyright law has never considered speed or volume of production, yet now the copyright office is claiming that precisely this implicates fair use. That said I'm right there with them when it comes to illegal access.
How much did grandma have to pay for downloading one mp3? I hope Meta pays the same amount multiplied by all the works they pirated.

- Re: (Score:2)
  
  by pjt33 ( 739471 ) writes:
  
  Copyright law has never considered speed or volume of production, yet now the copyright office is claiming that precisely this implicates fair use.
  I've only read the summary, but I'm not seeing anything related to speed or volume of production. Am I overlooking something?
  - Re: (Score:2)
    
    by Entrope ( 68843 ) writes:
    
    No, you're not. The assertion you quoted to is entirely false; it is a straw man to stand in for the Copyright Office's observation that AI companies are engaging in large-scale copyright infringement of a very traditional character.
    - Re: (Score:2)
      
      by thegarbz ( 1787294 ) writes:
      
      Except it's not, think about it. If you remove the word "troves" from the quote, then the entire argument is basically completely at odds with decades of established case law saying such work would be permitted under copyright rules. You can use materials to inspire new works and sell those works competing with the original. That is something that has been permitted since the beginning, it is the basis of fair use.
      If the transformative aspect is the same, and the commercial aspect is the same, then there's
      - Re: (Score:2)
        
        by Big Hairy Gorilla ( 9839972 ) writes:
        
        Thank you very much.
        Fair use is you reading a book and maybe even applying knowledge you gleaned.
        The vast industrial scale of harvesting the web is the defining difference between you reading a book and Big AI slurping it up as training data.
      - Re: (Score:2)
        
        by Entrope ( 68843 ) writes:
        
        If you remove the "troves" bit from the quote, then the argument becomes:
        But making commercial use of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries.
        This is still trivially true under copyright law. If you want to point to supposed decades of established case law, then do so. Argument by unbacked assertion is a fallacy. Inspiration is separate from "fair use" -- and there are decades of litigation over where the boundaries are for characters or plots "inspired by" copyrighted material.
    - Re: (Score:2)
      
      by dfghjk ( 711126 ) writes:
      
      IT IS LITERALLY IN THE TITLE! "Fair Use Isn't 'Commercial Use of Vast Troves of Copyrighted Data'."
      What the fuck do you think that means, you moron?
      "...the Copyright Office's observation that AI companies are engaging in large-scale copyright infringement..."
      You literally just admitted that the quote was "entirely true". It's not fair use because it's "large-scale", literally a judgement made on "volume of production". You are an idiot.
      - Re: (Score:2)
        
        by Entrope ( 68843 ) writes:
        
        The title means that infringing copyright involving "vast troves" doesn't make it stop being an infringement of copyright. The logic is right in your second quote: AI companies are engaging in large-scale copyright infringement, not small-scale infringement. It's the core of their business model, not a side line.
  - Re:Intersting take (Score:5, Interesting)
    
    by buck-yar ( 164658 ) writes: on Monday May 12, 2025 @06:29AM (#65369949)
    
    Volume was mentioned in court against mp3.com.
    Sep 7, 2000 A federal judge Wednesday ordered MP3.com to pay as much as $250 million to Universal Music Group for violating the record company's copyrights by making thousands of CDs available for listening over the Internet.
    U.S. District Judge Jed S. Rakoff punished the online music-sharing service at $25,000 per CD, saying it was necessary to send a message to Internet companies.
    Universal Music Group, the world's largest record company, had urged a stiff penalty in a case closely watched by Napster and other businesses that share music or other copyrighted material over the Internet.
    The judge said some Internet companies "may have a misconception that, because their technology is somewhat novel, they are somehow immune from the ordinary applications of laws of the United States, including copyright law."
    He added: "They need to understand that the law's domain knows no such limits."
    MP3.com said it will appeal. The company had argued that a penalty of any more than $500 per CD would be a virtual "death sentence."
    Shares of MP3.com were halted before the decision; the most recent trade was at $7.88 per share, down 68.8 cents on the Nasdaq Stock Market. https://www.utdailybeacon.com/... [utdailybeacon.com]
    
    Imagine $25k per infringement against Meta? Neither can I. They probably lobbied and had the law changed. Or the judge doesn't want to crash the stock market (who doesn't hold meta stock directly or indirectly).
    
  - Re: (Score:2)
    
    by thegarbz ( 1787294 ) writes:
    
    Yeah not directly, but it was implicit in their decision. The world has a long LONG standing precedent that if work is transformative it is permitted for fair use, even if the result competes with the original. The only thing new here is someone claiming it shouldn't be the case because of "troves" of data being used.
    Volume is the whole basis for the argument here. You can read a book and write a book with similar story elements and styling selling it to compete agains the one which inspired yours and it wo
    - Re: (Score:2)
      
      by pjt33 ( 739471 ) writes:
      
      The "troves" of data being used is volume of consumption, not volume of production, and volume of consumption has always been a factor in fair use and similar concepts. One rule of thumb is that if you're copying more than 5% of the original work, that weighs against the use being fair. (And, anticipating one common argument, legally the training process copies 100% of all the works used, even if less than that ends up encoded directly in the weights).
    - Re: (Score:2)
      
      by jakimfett ( 2629943 ) writes:
      
      Yeah except it doesn't actually and in fact generate works. It regurgitates bits of other people's work, often verbatim, and because those other people's works were taken without compensation, any model trained on said literally stolen data is violating copyright law.
      
      Again, this has nothing to do with volume, stolen data is stolen data whether you're stealing a small amount or large.
      
      Personally I hard agree that any model trained on stolen data should be made available for similar "fair use" access by
- Re: (Score:2)
  
  by dfghjk ( 711126 ) writes:
  
  Definitely, one of the most obvious flaws here is the "two rights make a wrong" logic. It's a shitty take on fair use but it serves the interest of the Office.
  At risk of sounding like a victim was raped because she asked for it, here a copyright holder's rights were violated, according to the Office, because they asked for it. You have to realize that, by the standards set forth in the document, copyright infringement doesn't occur when the material is consumed, it occurs before that. It occurs when a by
  - Re: (Score:2)
    
    by Big Hairy Gorilla ( 9839972 ) writes:
    
    > But this data exists expressly for that purpose and for NO OTHER purpose.
    LIke.. the data here is content on a website?
    
    I think you just defined the difference between consensual sex and gang rape.
The problem isn't only about tech giants (Score:3, Insightful)

by el84 ( 10322963 ) writes: on Monday May 12, 2025 @04:09AM (#65369823)

While it is patently immoral for big internet companies to basically steal the entirety of human creative output, in order to train their stupid (so-called) ai models and fill their future pockets, if we don't do it then bad guys with absolutely no morals in far flung parts of the world will do it anyway, OK we can have the smug sense that we have done the right thing when we are the penniless vassals of our current global technology competitors, who will remain unnamed out of courtesy.

- Re:The problem isn't only about tech giants (Score:5, Insightful)
  
  by martin-boundary ( 547041 ) writes: on Monday May 12, 2025 @04:26AM (#65369835)
  
  While it's immoral to [steal from|rape|torture|slander|kill|enslave] my neighbours, there are people out there, somewhere, who are absolutely willing to [steal from|rape|torture|slander|kill|enslave] my neighbours. I can be smug about not doing it to them myself, but it's just a matter of time until they become victims, so it's really ok if I also [steal from|rape|torture|slander|kill|enslave] my neighbours. Besides, I'm bored thinking about implications.
  
  - Re: (Score:2)
    
    by Pinky's Brain ( 1158667 ) writes:
    
    It's also immoral not to free trade with them, so as long as the raping is economically efficient you should rape too, otherwise getting outcompeted is only just.
    Free trade, raping all boats.
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    The rule of law cannot possibly apply to everyone, so it should apply to no one. Except when I'm in power, then what I say goes. And I'll gain that power by [steal from|rape|torture|slander|kill|enslave] my neighbours.
  - Re: (Score:2)
    
    by Big Hairy Gorilla ( 9839972 ) writes:
    
    You're not wrong. Just because you stick to some sort of moral code or rule of law doesn't mean someone else will.
    
    So maybe we should apply the rules of law that make our society livable, and just use the AI's coming out of those other place that have no such rules. Then presumably their society implodes under the natural pressures of harvesting the knowledge of their fine peoples and renting it back to them, and we retain a human scale livable society?
    
    Like your father said to you: Just because your friends
- Re: The problem isn't only about tech giants (Score:2)
  
  by reanjr ( 588767 ) writes:
  
  No, they wouldn't. The amount of money spent on this shit only makes sense if you're going to widely commercially target the U.S. and the wider West. Bad actors would never be able to come to market.
  - Re: (Score:3)
    
    by dfghjk ( 711126 ) writes:
    
    I voted for Trump because he said he would hurt the right people, he won't hurt me!
- Re: (Score:2)
  
  by strikethree ( 811449 ) writes:
  
  OK we can have the smug sense that we have done the right thing when we are the penniless vassals
  You were going to be a penniless vassal one way or the other. What difference does it make how it happens?
Nice to see the licensed models mentioned for once (Score:5, Interesting)

by balaam's ass ( 678743 ) writes: on Monday May 12, 2025 @06:16AM (#65369933) Journal

In the report: "Commenters cited several examples of AI tools trained on licensed or public domain content, such as Adobe’s Firefly (an image generator), Boomy (a music generator), Getty Images’ AI image generator, and Stability AI’s Stable Audio (a music generator)."
Often only the infringers get mentioned.

I have said that for ages (Score:2, Insightful)

by gweihir ( 88907 ) writes:

Only to get ridiculed by some AI fanboi assholes, with deranged claims about "learning" and other ludicrous claims. Nice to see the actual experts recognize the problem as well. Take that, AI morons.
- Re: (Score:2)
  
  by thegarbz ( 1787294 ) writes:
  
  The Librarian of Congress is an "expert" on AI now? It's quite early, do you always smoke for breakfast? And where can I get that good stuff from.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    No. An expert on copyright, Try to keep up.
- Re: (Score:2)
  
  by dfghjk ( 711126 ) writes:
  
  This is a shitty, ignorant take made clearly without reading any of the document. Or worse, maybe you did read it and this is what you came up with.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Hahaha, no. I am just not as dumb and disconnected as you are.
not impressed (Score:2)

by dfghjk ( 711126 ) writes:

First, the Copyright Office is not part of the judicial branch. They can voice their opinion, but they are not empowered to say what the law is.
Second, there is clearly a biased narrative at work here. If you look at the very start of the infringement discussion (page 26, Section A) the very first thing you see allegations of "right of reproduction", with the Office saying "commenters agreed with or did not dispute that copying during the acquisition and curation process implicates the reproduction right"
Dangerous extension of copyright concept (Score:4, Interesting)

by orzetto ( 545509 ) writes: on Monday May 12, 2025 @07:45AM (#65370075)

This looks like an alarming extension of copyright overreach if such restrictions are applied to AI. AI reads content (which may be copyrighted, as this post you are reading is, as nearly everything on the Internet is) and learns from it, and that's how it can process a book and provide a summary within a few minutes.
If this were an infringement of copyright, basically any form of human learning would also be. Just reviewing a book, a game, anything copyrighted could be constructed as infringement and prosecuted. Parodies, tributes, quotations. Imagine Leni Riefenstahl suing George Lucas for the final scene of the original Star Wars.
If an AI generates text that is substantially a copy of a copyrighted training input, that's a copyright breach; but AIs can be trained to avoid this, just like people can - learn the concept, avoid copying the form.
The report of the Copyright Office contains the following statement on page 26:
The steps required to produce a training dataset containing copyrighted works clearly implicate the right of reproduction. Developers make multiple copies of works by downloading them; transferring them across storage mediums; converting them to different formats; and creating modified versions or including them in filtered subsets. In many cases, the first step is downloading data from publicly available locations, but whatever the source, copies are made—often repeatedly.
That's the same way any browser operates. For that sake, a lot of browsers pre-download links on a page, so that copies are made locally before any action is taken by the user. Proxy servers also make local copies of often-requested files. If this is infringement, anyone who ever accessed the Internet is a criminal. What if you move a legally-owned copyrighted file from one hard disk partition to another? That would technically require creating a copy.
In practice, the line is drawn when you start distributing (other people's) copyrighted works, which also is the only enforceable one. That is what should be required of AI engines.
Obviously the reason is another: owners of copyrighted work do not want AI to learn their concepts and re-express them (which has always been legal for humans), because their customers will find it easier to ask the AI rather than pay/read the original documents themselves, busting their business model.

- Re: (Score:3)
  
  by dfghjk ( 711126 ) writes:
  
  Great comments.
  "If an AI generates text that is substantially a copy of a copyrighted training input, that's a copyright breach; but AIs can be trained to avoid this, just like people can - learn the concept, avoid copying the form."
  This is the most important point. Infringement occurs when an AI vomits up sufficiently large portions of a copyrighted work. AI's must be developed to avoid this, as this is what we require of people as well. You can read a book and you can have a photographic memory, you ca
How is this different from a human (Score:3)

by MpVpRb ( 1423381 ) writes: on Monday May 12, 2025 @10:32AM (#65370577)

...reading books and using the knowledge commercially?
None of the training data is copied

- Re: (Score:3)
  
  by ledow ( 319597 ) writes:
  
  It's not.
  If a human regurgitates vast portions of a copyright work - whether directly from memory or otherwise - and then sells it as part of a commercial service to other people, they will get in just as much trouble for being outside the bounds of fair-use.
  This isn't about "what an AI can do" vs "what a human can do", it's literally about "is the company's end usage covered under fair use", and wholesale regurgitation of source data (books) can be coaxed out of all LLMs *and* some of these companies are d
When they outlaw AI ... (Score:2)

by ihadafivedigituid ( 8391795 ) writes:

... only outlaws will have AIs.

Welcome to the William Gibson future.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Also a nice example what deep government corruption looks like.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

I cannot see this stopping the AI spiders (Score:4, Insightful)

Re:I cannot see this stopping the AI spiders (Score:5, Interesting)

Re:I cannot see this stopping the AI spiders (Score:5, Insightful)

Re:I cannot see this stopping the AI spiders (Score:5, Informative)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:I cannot see this stopping the AI spiders (Score:5, Interesting)

Re: I cannot see this stopping the AI spiders (Score:2)

Re: I cannot see this stopping the AI spiders (Score:2)

Re:I cannot see this stopping the AI spiders (Score:5, Interesting)

Repeat after me (Score:5, Insightful)

Re: Repeat after me (Score:5, Insightful)

Re: Repeat after me (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: Repeat after me (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Re:Repeat after me (Score:4, Interesting)

Re: (Score:2)

Re: (Score:2)

Re:Repeat after me (Score:4, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Copyright is being abused (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: Repeat after me (Score:2)

Re: Repeat after me (Score:2)

Re: Repeat after me (Score:5, Informative)

Re: (Score:2)

Re: (Score:3)

Re: Repeat after me (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:Repeat after me (Score:5, Insightful)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: Repeat after me (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

People to US Copyright office (Score:5, Insightful)

Re:People to US Copyright office (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Intersting take (Score:5, Interesting)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:Intersting take (Score:5, Interesting)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

The problem isn't only about tech giants (Score:3, Insightful)