Piracy Slashdot.org

Meta Claims Torrenting Pirated Books Isn't Illegal Without Proof of Seeding (arstechnica.com) 170

An anonymous reader quotes a report from Ars Technica: Just because Meta admitted to torrenting a dataset of pirated books for AI training purposes, that doesn't necessarily mean that Meta seeded the file after downloading it, the social media company claimed in a court filing (PDF) this week. Evidence instead shows that Meta "took precautions not to 'seed' any downloaded files," Meta's filing said. Seeding refers to sharing a torrented file after the download completes, and because there's allegedly no proof of such "seeding," Meta insisted that authors cannot prove Meta shared the pirated books with anyone during the torrenting process.

[...] Meta ... is hoping to convince the court that torrenting is not in and of itself illegal, but is, rather, a "widely-used protocol to download large files." According to Meta, the decision to download the pirated books dataset from pirate libraries like LibGen and Z-Library was simply a move to access "data from a 'well-known online repository' that was publicly available via torrents." To defend its torrenting, Meta has basically scrubbed the word "pirate" from the characterization of its activity. The company alleges that authors can't claim that Meta gained unauthorized access to their data under CDAFA. Instead, all they can claim is that "Meta allegedly accessed and downloaded datasets that Plaintiffs did not create, containing the text of published books that anyone can read in a public library, from public websites Plaintiffs do not operate or own."

While Meta may claim there's no evidence of seeding, there is some testimony that might be compelling to the court. Previously, a Meta executive in charge of project management, Michael Clark, had testified (PDF) that Meta allegedly modified torrenting settings "so that the smallest amount of seeding possible could occur," which seems to support authors' claims that some seeding occurred. And an internal message (PDF) from Meta researcher Frank Zhang appeared to show that Meta allegedly tried to conceal the seeding by not using Facebook servers while downloading the dataset to "avoid" the "risk" of anyone "tracing back the seeder/downloader" from Facebook servers. Once this information came to light, authors asked the court for a chance to depose Meta executives again, alleging that new facts "contradict prior deposition testimony."
Meta has been "silent so far on claims about sharing data while 'leeching' (downloading) but told the court it plans to fight the seeding claims at summary judgement," notes Ars.

  • by OrangAsm ( 678078 ) on Thursday February 20, 2025 @10:36PM (#65183733)

    We didn't fuck unless I seeded.

  • bs (Score:4, Insightful)

    by migos ( 10321981 ) on Thursday February 20, 2025 @10:41PM (#65183741)
    They used stolen content for business purpose, i.e. to make money. That's where the line is drawn. It's one thing to download and consume at home for personal use, another to make money off it.
    • They used stolen content

      Nothing was stolen, everyone still has all the files they had before. This is about copyright infringement. Copyright is (speaking in digital terms) a temporary monopoly on certain manipulations of certain strings of bits. Infringement is stepping on that.

      • Re: (Score:2, Insightful)

        by sg_oneill ( 159032 )

        Nothing was stolen, everyone still has all the files they had before. This is about copyright infringement. Copyright is (speaking in digital terms) a temporary monopoly on certain manipulations of certain strings of bits. Infringement is stepping on that.

        Arguing with definitions is a useless way to not say anything at all.

        Engage with the argument, stamping your feet and demand people use words in the way you approve contributes nothing.

        • by Anonymous Coward on Thursday February 20, 2025 @11:28PM (#65183823)
          It's not "stamping your feet and demand people use words in the way you approve." These are the terms used by law. The law says copying non-rivalrous goods is copyright infringement, not theft. It has nothing to do with "arguing with definitions." This is the definition that matters.
          • by sg_oneill ( 159032 ) on Friday February 21, 2025 @12:43AM (#65183883)

            What is your point? How does that add anything constructive to the point the person was making?

            Pretending you don't understand what is being said because you pretend you don't understand their meaning is a disingenuous way to engage in discourse.

          • by dfghjk ( 711126 )

            But your quoted use of "stolen" was intended only to claim that a crime was committed, not what the legal definition of that crime was. Knowing that, you chose to engage in a pointless argument over precisely the definition of the crime, as if the crime not being "theft" invalidates that a crime was committed. Worse yet, your argument is tired and useless, it is decades old. What you are doing is worse than merely "stamping your feet", you're having the same temper tantrum that countless people more clev

        • by Kokuyo ( 549451 ) on Friday February 21, 2025 @03:34AM (#65184079) Journal

          The lack of precision in speech is exactly what led to the current Zeitgeist.

          Pirating is not theft. It not being theft does not absolve it of ethical misconduct, but theft it ain't.

          There is a reason we have many different words for killing a human, too. Not every death by the hand of another is murder.

          • by dfghjk ( 711126 )

            "The lack of precision in speech is exactly what led to the current Zeitgeist."

            No, it is merely a symptom.

            "Pirating is not theft. It not being theft does not absolve it of ethical misconduct, but theft it ain't."

            It may not be theft of the material, but it is theft of the copyright holder's right to control distribution. And "theft" is not a legal term nor is "pirating", and the question is not whether conduct is "ethical". You know, someone recently said "the lack of precision in speech is exactly what l

      • Re: (Score:2, Insightful)

        by msauve ( 701917 )
        >Nothing was stolen, everyone still has all the files they had before.

        The author's legal right to control the distribution of their works was taken from them. That's stealing.
    • Re:bs (Score:4, Insightful)

      by rsilvergun ( 571051 ) on Thursday February 20, 2025 @11:16PM (#65183809)
      No, the line is drawn at "can this person beat our lawyers in court?".

      No sense pretending the laws apply to a company as profitable and useful for propaganda as Facebook.
    • Re:bs (Score:5, Insightful)

      by ArchieBunker ( 132337 ) on Friday February 21, 2025 @12:24AM (#65183859)

      Fine them $250,000 per violation just like that scary warning before a movie.

    • To repeatedly repeat myself repeatedly: content theft is the new definition of free speech.
  • has the law changed? (Score:5, Informative)

    by cawdor ( 10162661 ) on Thursday February 20, 2025 @10:50PM (#65183755)

    I thought that "making a copy", i.e. downloading copyrighted works without permission from the copyright holder, is illegal. To not clog up the courts, generally the end user is left alone unless they made a profit from the infringement or infringed copyright on a large scale. You could argue that both are true for Meta. So unless the law has changed, Meta is clearly guilty and should be made to compensate the rights holders and pay fines or have the exec in charge go to jail, just like anyone else.

    Otherwise anyone could just leech movies legally, by Meta's argument.

    • back in November. Lots of folks seemed to have missed that. We had a choice, and we made it. I mean, not counting the approximately 7m Americans who were prevented from voting. But what's a little voter suppression & Jim Crow among friends?
      • by caseih ( 160668 ) on Thursday February 20, 2025 @11:54PM (#65183837)

        Russia still has laws, very strict ones even. All dictatorships do. But they don't apply to Putin's friends, those doing his bidding, and those who've paid their oligarchal dues. Unless of course they fall out with Putin and have an accident.

        And it's the same in the US now. Just because the president and his cronies are violating the law and trampling on the constitution doesn't mean that you, a mere citizen, can do the same. It's been this way for quite a while (best justice money can buy), but certainly it's going to get a lot worse now.

        • Can you please stop ruining my days?
        • by mjwx ( 966435 )

          Russia still has laws, very strict ones even. All dictatorships do. But they don't apply to Putin's friends, those doing his bidding, and those who've paid their oligarchal dues. Unless of course they fall out with Putin and have an accident.

          And it's the same in the US now. Just because the president and his cronies are violating the law and trampling on the constitution doesn't mean that you, a mere citizen, can do the same. It's been this way for quite a while (best justice money can buy), but certainly it's going to get a lot worse now.

          It's been that way in the US for some time now. The only thing that's changed is that they're not even bothering to hide it any more.

          • by dfghjk ( 711126 )

            But it was explicitly codified last year. We all witnessed the brazen criminality of Trump 45 followed by the refusal of the Biden administration or the Congress to do anything about those crimes, but now the courts have ruled that the President is above the Constitution.

        • by _merlin ( 160982 )

          He probably meant to say that you're no longer a nation with "rule of law", i.e. the idea that the law applies to everyone equally. As you say, all countries have laws. There's no country where the laws are truly applied equally and fairly. The US has had a problem with rich people being able to get away with things poor people can't for a long time. You've now decided that the president is above the law. You seem to have dropped the pretense of rule of law altogether. Things are only going to get wor

      • by dfghjk ( 711126 )

        Technically, it was when SCOTUS invalidated the Constitution in July. They declared the President immune from law and therefore immune from the Constitution (which is merely part of the law). And it wasn't in November that it took effect, it was the following January. We are now in a post-constitutional period, Americans simply don't realize it yet. Instead, we are cosplaying like Trump is going to have any respect for law at all. He does not.

        The laughable part is the talking heads arguing over what ha

    • Facebook doesn't have to make a copy of the book. They could feed the download stream straight to the ai tokenizer.

      • So if you directly stream, but do not save media it's fine? So if I set up an auditorium and have people pay me to see a movie, and I stream a movie (but do not save it) it is fine? So if I stream a movie and save a deep analysis of the movie that could be used to mostly recreate that movie a little differently without giving any credit to the studio or actors (but only watch the new version) it is fine? What if I root through the media files on your computer and use them to train a model to let me see a sl

        • Bro you just described a performance. Copyright law was enshrined in case law which gave a very clear definition on what violates copyright. A performance where you show the material to other people is in violation. Distribution is also a violation. Seeding is distribution, streaming is a performance. Both violate copyright. Meta was doing neither.

          Chilling effects: now the people understand why they must leech and never seed anything they pirate on torrent. More chilling effects: the people can train their

          • Your last sentence is crucial.

            People (sic) seem to forget that the idea of copyright law (any law in fact) is to benefit WE THE FUCKING PEOPLE.

            WE give the copyright privilege to creators to ENCOURAGE CREATIVITY and thus to benefit the people.
            We do not give them that privilege to fuck us over. Repeatedly. For 95 years.
      • The stream is a copy already.

      • by gweihir ( 88907 )

        Facebook doesn't have to make a copy of the book. They could feed the download stream straight to the ai tokenizer.

        Legally, that is called "making a copy". It does not matter what you do with that copy afterwards.

        • Re: (Score:2, Insightful)

          by i.r.id10t ( 595143 )

          So people with eidetic memory can only read stuff that is in the Public Domain or perhaps some stuff with Free-like CC licenses attached?

      • by msauve ( 701917 )
        So, if someone streams a movie, re-encodes it, and then saves it, they're fine? Will you pay for my legal defense if I do that on a large scale?
    • Not sure what the law in the US is, but in Canada downloading isn't actually illegal. It's the uploading (aka sharing) that gets you in actual trouble.

      That kind of sounds like what's happening in this instance. They didn't share anything.

      • Training an AI model on data and then sharing that model IS sharing it. If I can type "write me a version of Little Women in the writing style of Stephen King" into an AI model and get workable results, they are sharing that data.

      • by gweihir ( 88907 )

        Is that an exception for private downloads or for commercial ones as well? Because I somehow doubt the latter and what Meta did was commercial.

        Also, LLMs have shown time and again to sometimes output their training data. Hence the "sharing" is there as well.

    • Yes, copyright restricts all reproduction except where exempted. That's why there are specific exemptions for caching, private backup, fair use etc etc.

      Fair use or bankruptcy from statutory fines on registered works alone, those are the options for most of the industry.

    • If you just download a copyrighted work, the actual losses suffered by the copyright owner are pretty minimal, maximum the cost of a legal copy of the work.
      If you use it for commercial purposes or distribute it to others, then the losses suffered by the copyright owner are much higher. In the UK at least, it also makes it a criminal rather than merely civil matter.

  • by Anonymous Coward
    Even if they managed to download all ze torrentz while disabling any seeding, they'll still be sharing the data with everybody that uses their AI. It's still mass piracy regardless of whether they used FTP, HTTP or Torrents to get it.
  • Aaron Swartz (Score:5, Informative)

    by krakrjak ( 227602 ) <krakrjak@NospaM.gmail.com> on Thursday February 20, 2025 @10:52PM (#65183763) Homepage

    So when Meta does this it's altruistic, but when Aaron Swartz does it, it's a federal crime with 25-life. Make it make sense.

    https://en.wikipedia.org/wiki/... [wikipedia.org]

    • Re:Aaron Swartz (Score:4, Insightful)

      by Brain-Fu ( 1274756 ) on Thursday February 20, 2025 @11:13PM (#65183805) Homepage Journal

      Make it make sense.

      Ok:

      Rich and powerful businesses operate under a different set of laws than individual people.

      That's pretty much it. I can elaborate a bit though:

      Laws aren't handed to us by God. They aren't discovered by the scientific method. They are invented by human beings. In particular, they are invented by rich and powerful human beings who all share a common motivation: to remain rich and powerful. So, the purpose of the law is to protect their wealth and power.

      Ostensibly it is to ensure fair and equal treatment for everyone, keep everyone safe, etc. That's mostly true only inasmuch as such fairness and safety are necessary to keep powerful people powerful. It's not true in some lofty, philosophical, "everyone is equally important" sense. Sentiments like that are just there to get public buy-in so that people don't revolt.

      Meta, as a legal entity, is simply more important than any of the individual authors of those works. Their ambitions of creating a better LLM, are more important. So, the little people will be made to move. Their grievances will be heard, paid lip-service-to, and then ignored. There might be some token efforts, like some kind of legal clarity that will make it crystal clear that no other little people are allowed to do this sort of thing. There might even be something of a slap on the wrist to satisfy the mob's desire for vengeance.

      But Meta will not be treated like some uppity nobody. Meta will be permitted to pursue its ambitions.

      • How and why does our species allow this?
        • by MikeS2k ( 589190 )

          Have you experienced the intelligence of the average human? George Carlin was right when he said we get the rulers we deserve in a democracy - we elected them after all did we not? If the average person had any intelligence there would have been mass protest votes for third parties since at least the 1980's

          • I have experienced human intelligence at both extremes and along the spectrum, and I agree that it is depressing on aggregate. I question whether intelligence really even exists in our species (maybe it's some kind of illusion) and whether that would be a good thing for the planet anyway. I prefer consciousness.
    • Re: (Score:2, Interesting)

      by timholman ( 71886 )

      So when Meta does this it's altruistic, but when Aaron Swartz does it, it's a federal crime with 25-life.

      Let me flip that around for you. When Aaron Swartz did it, his actions were morally justified and his alleged crimes completely excused by the Slashdot crowd. But if Meta trains its AI on the same pirated material that Swartz acquired and distributed on the Internet, suddenly it's a federal crime that needs to be prosecuted.

      I personally despise Meta, but you could slice the hypocrisy with a knife on th

      • Aaron Swartz copied stuff to give everyone access to science that was largely funded by the public purse.

        Meta copied a bunch of stuff which wasn't publicly funded so they could profit off it.

        Do you really not see that beyond "copying" they are different?

    • I for one hope we finally have someone with deep pockets who can fight this bullshit. After all we just had a story about the RIAA getting personal information about internet subscribers again.

      You don't need to like Meta to support them using their money to upend the currently broken interpretation of copyright law.

    • Re:Aaron Swartz (Score:4, Informative)

      by gweihir ( 88907 ) on Friday February 21, 2025 @03:06AM (#65184043)

      Except Swartz did it non-commercially. Meta should have the Rico Act thrown at them.

  • This... looks bad (Score:5, Insightful)

    by quantaman ( 517394 ) on Thursday February 20, 2025 @10:56PM (#65183771)

    I've actually been somewhat understanding towards the LLM companies when it comes to the data acquisition. Like you can't really train these models without hoovering up the Internet and it is kinda tricky making sure you don't scoop up some improperly shared copyrighted material along the way.

    But Meta didn't accidentally download copyrighted materials. They deliberately sought out (and reshared) pirated materials, and tried to hide their involvement.

    They should get hammered big time for this.

    • You're far too forgiving of these cunts.

    • by gweihir ( 88907 )

      Indeed. A criminal conviction is the least they should get. I mean, they did commercial copyright infringement on mass-scale, perfectly knowing what they did. Should result in prison time for the decision makers at the very least.

    • Public access is not public domain. Before the DMCA you had implicit license to make copies for the purpose of viewing the web plus fair use, with the DMCA there are some explicit exemptions and fair use.

      The only difference between websites and books is that most websites aren't registered works, so the owner has to prove damages to get awards.

      • by msauve ( 701917 )
        >you had implicit license

        But that's not necessary. Ever heard of an HTTP request? You're asking for, not taking, content. The server, which is under the control of the copyright holder, decides whether to provide you with a copy. If they're not the copyright holder, it may be illegal copying and distribution.
        • Public access is not public domain. They gave one copy to a router, there's a dozen more transient copies made before you even see it ... and the implicit license ended there.

          • by msauve ( 701917 )
            You don't know how the Internet Protocol (IP is ambiguous here) works. One can send or receive. The copyright holder is sending content, any and all temporal copies made along the way are made under their direction.

            >Public access is not public domain.

            Yes, but that only means you can't make more copies than the one you were ultimately given.
    • by zmooc ( 33175 )

      So what? I do exactly the same when training my flesh and blood LLM. Does the implementation really matter that much? And if yes, why?! Just about all "knowledge" is encompassed in copyrighted stuff. As is just about anything you consume with your eyes that is not nature, ranging from your clothes to this comment and from the method you used to learn arithmetic to you singing your favorite song or reading a book from the library.

      Is everything I do a derivative of a copyrighted work? Yes. Should I therefore

  • This is so Facebook (Score:5, Interesting)

    by Rosco P. Coltrane ( 209368 ) on Thursday February 20, 2025 @11:05PM (#65183785)

    Not only did they torrent a ton of shit like petty pirates, they didn't even contribute bandwidth back by seeding for others.

  • Incongruities (Score:4, Insightful)

    by az-saguaro ( 1231754 ) on Thursday February 20, 2025 @11:09PM (#65183795)

    Meta ... is hoping to convince the court that torrenting is not in and of itself illegal, but is, rather, a "widely-used protocol to download large files."

    True, if you want to exchange or download large files, a protocol is needed - but that does not imply your legitimacy to download any file you want or what you can do with it.
    I can lift a can of beans off the grocery shelf and put it in my cart - a "widely-used protocol to obtain canned food". I can then go to checkout and pay, or I can go to the door and hope not to get caught.
    I can drive my car to the bank - a "widely-used protocol to obtain cash". I can then do a legal transaction at the teller window then drive away happy, or I can rob the teller and get away in the car.

    Does Meta believe the PR shit it spews? And - with all due respect to legitimate attorneys - do lawyers believe this self-serving nonsense when they make it up and spew it?

    "Meta allegedly accessed and downloaded datasets that Plaintiffs did not create, containing the text of published books that anyone can read in a public library, from public websites Plaintiffs do not operate or own."

    Yes, that is what libraries are for. What I cannot do is go to the library, borrow a book, photocopy or scan it, then republish and sell it as my own work.
    Of course, libraries are probably useless to the illiterate, and being a modern tech company seems to make its execs functionally illiterate, unable to read the law or any code of decency.

    • unable to read the law or any code of decency

      Ah, so laws represent decency? That's cute.
      Just out of curiosity, let's say it's your job to write the laws around the downloading of copyrighted material.
      If your son was to download a copyrighted book from a publicly accessible URL, but then delete the file without sharing it, what should his punishment be?
      What if he downloads a zip file with 1000 books?
      What if he downloads a zip with every book in existence?
      What if he writes an app that allows people to search for the title of a book they're looking

    • by gweihir ( 88907 )

      The whole thing is a nonsense argument. They are stalling. Torrenting is not and has never been illegal. What is illegal is downloading stuff and sharing stuff without permission. And they did it commercially and on mass-scale.

  • The same about random numbers, not random unless the random number generator is seeded.

    Or, it's not illegal or wrong, unless you get caught seeding/doing it...

    More simply, companies in the past caught doing something illegal, well they made $1-billion and then had to pay a $1-million fine, but denied any wrong doing.

    JoshK.

  • commercial companies can get away with anything. It's ridiculous, all you need is a good lawyer and there you go... even pirating is ok for the wealthy. And the crazy thing is Zuck could win this one because Facebook's value is 7x that of the movie industry. What isn't fair is all the people buried by other companies.

  • by Petersko ( 564140 ) on Friday February 21, 2025 @12:15AM (#65183849)

    I'm not a particularly experienced seeder, but on the occasion that I did, I noticed that uploads of packets started pretty quickly, even though the file was incomplete. "Seeding" happens when the download is finished... but doesn't sharing start right away?

  • by pipatron ( 966506 ) <pipatron@gmail.com> on Friday February 21, 2025 @12:18AM (#65183853) Homepage

    As someone who is generally for filesharing and very much against the near-infinite copyrights we have right now, I think it is a good thing that a massive mega-corporation is finally on "our" side.

    It used to be Disney and Microsoft vs the common people so obviously Disney won. Now the Disney attorneys can battle with the Meta attorneys, and I'm sure Microsoft is going to keep its mouth shut unless someone makes them admit to doing pretty much the same thing.

    • Re: Finally (Score:4, Interesting)

      by greytree ( 7124971 ) on Friday February 21, 2025 @02:24AM (#65183995)
      Can we dream that this case will lead to corporate pressure for fair, 5 year, copyright terms and the copyright privilege once again being used for its intended purpose of *encouraging* creativity?
    • They are not. The enemy of my enemy is not my friend.

      What's far more likely is meta will either get some kind of exception, or a token fine, nothing like the equivalent of 10 years in a federal prison.

      I can guarantee you this will not lead to a loosening of anything for individuals or companies smaller than megacorps.

      I truly wish it were not so but I think that is the world we live in. Twas ever thus

  • Honestly I'm hoping Facebook "wins" and also "loses"

    I hope they win, in that "using torrents is not in itself copyright infringement", but also "ingesting unlicensed materials without attempting to license them is still copyright infringement".

    • I was going to post something similar to this, if they can argue that downloading copyrighted materials is legal but sharing is infringing doesn't that mean that the last 2 decades of lawsuits against residential customers by the *AA wouldn't be valid any more?

      I'm too lazy to look it up but I recall one of the *AAs using the IP addresses of users downloading from a torrent to try and force ISPs to identify the customers so they could sue them. I however don't recall that seeding was a requirement, jus
  • Even if they're not stealing content, which they are, if they're all pulling often, they're raising hosting costs for everyone, and nobody gives a crap about robots.txt. And I mean pulling without attribution, not like ethical search engines, if such exist.
  • and if I did, it wasn't illegal! and if it was illegal, you've got no proof!

    Sounds like something Bart Simpson would say?

  • by Jayhawk0123 ( 8440955 ) on Friday February 21, 2025 @01:06AM (#65183921)

    I would really love to see how they managed to download such a vast trove of works WITHOUT seeding a single packet of data.

    As anyone who torrents can attest... turning seeding off or throttling it essentially kills the download... at the file sizes they were downloading, to torrent those with seed off would take years.

    They're flat out lying and full of shit.

    And no.. stopping it at the firewall would have also killed it...

    So them saying there is no proof means they simply torrented from an IP not linked to Meta, which they essentially said when they said they took care to not use Meta servers... which would also mean a guilty conscience and an active attempt to conceal them breaking the law. They're making the case against themselves. If the court finds that making copies of others' works for your own profit is legal (by the simple act of downloading, any copies made during the training process, putting the works into memory, a drive backup, etc.), it will cause a drastic shift in how the laws are interpreted. NOT GOING TO HAPPEN.

    But that's only if META's actions are treated similarly to how a person would be prosecuted. (which on a side note... the people who did it should be charged, not just the company... it was a human doing it... might make people think twice about doing shady things on behalf of companies thinking they have immunity.)

    Meta's only course at this stage is to go sealed court records, NDA's and a cash settlement.

    • by gweihir ( 88907 )

      Indeed. Sounds like a highly criminal endeavor. Probably need to find that part of Meta to be a criminal enterprise.

  • Charging for access to a derivative work would be a big no-no. Is an LLM a derivative work, or is it more like a child you are teaching to read by going to the library? I wouldn't think it illegal for a minor to learn how to read or become a writer by downloading books and not reselling them, though some writers or some countries might say you should go to a library where they have bought a copy. I don't think the "I only smoked second-hand" defense is worthy of them (or maybe it is *just* like them..) but if they had the courage to make it, they might actually have a point. That in the modern day there *might* be use-cases for massive data acquisition and processing that are to the public good. I don't think the current Wild West approach is.

    Back in a writing class I remember the story of one famous writer who (on a mechanical typewriter, in the day) typed in the books of other authors they admired to get used to their writing. I don't know if Meta could argue they are poor, but they did release it for free, so I'd guess they are in a better situation than closed source vendors. But ultimately there probably would need to be a law about accepted uses for massive datasets regardless of provenance. Some such uses might be for private use, for LLM training if given back to the public as open source, statistical analysis, or to enable noncommercial search engines such as for home use or in a library or school. Certainly there is apparently no other legal way to search through book content than having the data on your hard drive or going to Google Books or maybe Google Cite if that is still a thing, not even sure if search works there.

    So in that sense, it is either utterly illegal or there are shenanigans going on with special cases ironed out for corporations with billions of dollars. This lawsuit might be a good opportunity to specify some use cases where authors are not getting infringed, rather they are being promoted, learning is being promoted, and discovery of works, based on either knowing the exact words you want to find or what you can specify to an AI, is promoted. Those all seem like good things. The problems come from billionaires making more billions from authors and artists without compensating them, and creating works based on their styles that effectively put them out of business and limit opportunities for young writers and artists to make a living. Generative AI currently has boundless opportunity for expansion, and without recognition of those dangers I cannot see people whose creativity is their livelihood being complacent about teaching AIs with their work.

    Google Books lets you preview the inside of books. I don't know how it has changed over the years and there were a bunch of lawsuits at one time, but I notice The Stand by Stephen King is there, including its cover art, it doesn't seem to reproduce a page about no copying allowed though. The Cat Who Walks Through Walls by Robert Heinlein doesn't seem to have a preview (though it has different versions, maybe one does) but it does have links to purchase on Amazon and an online bookstore local to me it seems. The Catcher in the Rye has a weird cartoon font cover page saying it is from Bibliomania Publishing in Egypt.. no clue if this is a weird scofflaw publisher or the real deal.

    Anyway, I leave it to Meta and others, probably the billions mean there will be a legal loophole for LLMs and this is one of the tech bros' interests in the current U.S. administration. But if such loopholes can include education and provision for private use then I could see the Library of Congress and similar national libraries in other countries playing a role in scanning and hosting the torrents. Since there actually are good arguments for being able to search and index by concept the collected culture of the world, while promoting authors' rights and livelihoods in a balanced, legal manner.

    • by gweihir ( 88907 )

      Machines are not humans and do not have the privileges humans have. "Machine learning" is not learning. The term is used as a simplifying analog.

  • They are grabbing information, passing it through their systems, then distributing it from those systems. The protocols are not the issue, it is the acts that matter. It doesn't matter if they used IP over carrier pigeon to transport the bits into and out of their systems. They are taking information then profiting off that information.
    • by gweihir ( 88907 )

      Indeed. This is commercial copyright infringement, on mass-scale, and that comes with criminal penalties. Might even find that part of Meta to be a criminal enterprise.

      I mean, if this was some single mom downloading something (i.e. non commercial downloading), she would be threatened with prison time and a few millions in fines.

  • As long as we disable seeding, that is? Well, in that case AI would have had at least one positive effect.

    Looking forward to torrenting the last Disney movies! Or not.

  • by bradley13 ( 1118935 ) on Friday February 21, 2025 @02:54AM (#65184027) Homepage

    FWIW that is the situation in Switzerland. If something is available on the internet, you can download it. It is, however, illegal to *provide* content that you do not have the right to distribute.

    For the consumer, this seems like a completely fair solution. Whether it should apply to companies is, perhaps, a different question.

    • by ledow ( 319597 )

      So if you download a Windows ISO or a copy of Oracle or whatever, you can just use it in perpetuity, then, right?

      So why does any company or person in the entire country pay for software, books, artwork or anything else?

      I don't think that's how it works at all, or you're greatly oversimplifying it.

      • Well, if you aren't a giant corporation you can already get away with downloading whatever the hell you want. What's the worst that can happen, you get an email from your ISP saying, "we don't care but please don't"? And yet people still prefer to do it legally.
  • Worse than copyright felons, they are a bunch of damned LEECHERS.

    Just shun them . . .

  • OK, so copying something isn't violating copyright, as long as you don't let others copy what you copied. Everyone should be copying their school text books instead of buying them then, because that's legal. Libraries have a problem, though.

    • If running one of history's worst acts of industrial-scale piracy for profit was fine because they didn't upload, then pirating anything from Napster or Megaupload should've been 100% legal.

  • I bet a lot of pirates find this interesting, as people suing usually seed themselves and wait to see who begins to load from them. This way they only prove the download.

    I think the usual argument is "Only who is seeding causes 1 fantastillon in damages" so downloading is not worth going to court as the process would be only about like 30 USD in damages, if the user was not seeding (and thus starting the chain reaction of all others loading the content!). This defense will be hard, if you downloaded Terabytes of u

  • I would like facebook to show how they distributed the files to their training systems.

    Did they not make any copy that a human in their company had access to?

  • I didn't rob a bank. I just took money from someone else that robbed a bank, but didn't spend any of the money. No crime! /s

  • was set up a team to investigate torrent piracy to determine if it may impact their or their customers' business. And that like many anti-piracy groups, engaged in torrenting to gather data about how and by whom pirated material spreads. For which they would just happen to use the datasets in question.
  • by Pop69 ( 700500 ) <.ku.oc.ytraneb. .ta. .yllib.> on Friday February 21, 2025 @08:35AM (#65184425) Homepage
    That's not what people have been jailed and fined for by the RIAA in the past, has the law been changed?

