Piracy Australia Movies Television The Internet Your Rights Online

Study Finds 0.3% of BitTorrent Files Definitely Legal

Posted by timothy
from the can-we-call-that-vanishingly dept.
Andorin writes "It's common knowledge that the majority of files distributed over BitTorrent violate copyright, though the exact percentage is unclear. The Internet Commerce Security Laboratory of the University of Ballarat in Australia has conducted a study and found that 89% of files examined were in fact infringing, while most of the remaining 11% were ambiguous but likely to be infringing. Ars Technica summarizes the study: 'The total sample consisted of 1,000 torrent files—a random selection from the most active seeded files on the trackers they used. Each file was manually checked to see whether it was being legally distributed. Only three cases—0.3 percent of the files—were determined to be definitely not infringing, while 890 files were confirmed to be illegal. ' The study brings with it some other interesting statistics; out of the 1,000 files, 91 were pornographic, and approximately 4% of torrents were responsible for 80% of seeders. Music, movies and TV shows constituted the three largest categories of shared materials, and among those, zero legal files were found."

  • by Anonymous Coward on Saturday July 24, 2010 @12:18AM (#33011236)

    Choosing the most popular seeds gives very skewed results. I bet the overall percentage of pornographic torrents is much higher than 9%. Similarly, we may see a large change in the number of legal files.

    • by JavaBear (9872) * on Saturday July 24, 2010 @12:29AM (#33011294)

      I was about to point out the same, most legal seeds are probably not among the most active. I'm not trying to be apologetic about the rampant piracy that Torrents are also used for, however saying that only 0.3% are legal is misleading, using the selection criteria they did, and a relatively small sampling at that.

      • by Anonymous Coward on Saturday July 24, 2010 @02:07AM (#33011676)

        I think that's the point. If they did a proper random sample, let's say they ended up with 50% legal, 50% illegal, it wouldn't mean much if the illegal torrents accounted for 99% of the bandwidth/users.

        • Re: (Score:2, Insightful)

          by Anonymous Coward

          No, but they also didn't rule out that 50% of the bandwidth was used for legal files.
          Say for example that the 1000 most popular torrents are infringing torrents and they use 1TB/s of bandwidth each.
          Then you have 1000000 non-infringing torrents that only use 1GB/s each.
          Applied to this scenario, their method would conclude that 100% of torrents were infringing, when in reality infringing torrents made up less than 0.1% of all torrents and only 50% of the bandwidth.
          This is the things
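
The arithmetic in the hypothetical above can be checked with a short sketch. All figures are the commenter's made-up numbers, not measurements, and the variable names are only illustrative.

```python
# Hypothetical numbers from the comment above; not measured data.
GB_PER_TB = 1000  # work in GB/s, taking 1 TB/s = 1000 GB/s

infringing_count = 1_000
infringing_rate = 1 * GB_PER_TB   # each infringing torrent: 1 TB/s

legal_count = 1_000_000
legal_rate = 1                    # each non-infringing torrent: 1 GB/s

infringing_bw = infringing_count * infringing_rate
legal_bw = legal_count * legal_rate

# Share by number of torrents vs. share by bandwidth.
share_of_torrents = infringing_count / (infringing_count + legal_count)
share_of_bandwidth = infringing_bw / (infringing_bw + legal_bw)

print(f"infringing share of torrents:  {share_of_torrents:.4%}")
print(f"infringing share of bandwidth: {share_of_bandwidth:.0%}")
```

Looking only at the 1,000 most popular torrents would report 100% infringing in this scenario, even though neither of the two shares computed here is anywhere near that.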

    • You'd think the seeders would be more helpful. How about a legal/illegal tag on the torrent?
  • 0 media legal (Score:4, Insightful)

    by LoudMusic (199347) on Saturday July 24, 2010 @12:21AM (#33011244)

    I think the zero legal music / tv / movie files can be attributed to the fact that files of those types that are legal to distribute are usually just served from http or ftp servers. They don't get put into a torrent type download system.

    I'm not surprised that 4% of the files were being downloaded by 80% of the community. I bet the #1 file was being downloaded by more than 50% of the community. Individuals can, and often do, download more than one file at a time.

    • Re:0 media legal (Score:5, Insightful)

      by wisnoskij (1206448) on Saturday July 24, 2010 @12:32AM (#33011308) Homepage

      I completely disagree, a lot of times free media gets put into torrents and sometimes is the only way to even get it.
      People that are not making money do not have the money to pay for the bandwidth to distribute to many people.

      for example see Pioneer One.

    • Re:0 media legal (Score:4, Informative)

      by cgenman (325138) on Saturday July 24, 2010 @01:00AM (#33011458) Homepage

      I'm actually a little disheartened by the lack of legal torrent distribution. It's a great medium for getting your content out there, people! If you're doing a straight HTTP server for your files, you could be saving a lot on bandwidth (and helping people to get your content faster) by setting it up as a torrent.

      • I'm actually a little disheartened by the lack of legal torrent distribution. It's a great medium for getting your content out there, people!

        A while back the absolutely fantastic NPR show This American Life [thisamericanlife.org] would precede every podcast/mp3-edition with a plea for money to pay for their relatively gynormous bandwidth costs. I wrote to them suggesting they try out bittorrent and to Bram Cohen's company suggesting they use TAL to showcase the commercial benefits of bittorrent for legitimate distribution - win/win for everybody.

        Alas nobody paid any attention to this joe random emailer, not even a cursory "thank you for your email."

        • by brit74 (831798)
          Maybe because podcasts need to tie into the system. I'm pretty sure that iTunes can't download torrents.

          Bram could actually do himself a great big favor towards this kind of legitimate adoption if it were actually possible to police torrents. Of course, I think he knows that Torrents are effectively the lawless wild-west, and he realizes that's his niche in the market - enabling the delivery of illegal wares outside the long arm of the law. Torrents would be a pretty obscure technology if it wasn't fo
        • It might have something to do with the fact that (for all his good points) Ira Glass seems to be a grade A luddite =) [his interview with Jesse Brown on Search Engine told me so]. NTTAWWT - that probably helps him make the show as wonderful as it is. I'm amazed he actually did a TV series (available on Netflix instant if you're interested).
      • by Ahnteis (746045)

        They only used certain trackers. I doubt they included (fex) Blizzard's updaters which probably provide a pretty hefty amount of traffic in-and-of themselves. But those wouldn't show up on most trackers I'm guessing since they probably use a Blizzard tracker, not a [website] tracker.

    • Re:0 media legal (Score:5, Insightful)

      by slashqwerty (1099091) on Saturday July 24, 2010 @01:18AM (#33011524)
      How do they know what is or is not legal? With Viacom caught paying third parties to upload their material to YouTube and then suing Google for distributing the material it appears the copyright holders don't even know which content is legal.
      • by rtb61 (674572)

        That is the problem, their analysis failed in two regards. Firstly it was not a truly random sampling but only the 1000 most active seeds, which they claimed as random by glossing over the fact there were many equally popular seeds. Their second failure was in not defining the method they used to validate what was and was not infringing content, they just made an assumption, a guess, now that's really quite unscientific and a major fail.

        It would also be interesting to note of those 1000 files how many con

    • by burris (122191)

      My favorite tracker [etree.org] is 100% non-infringing music. In fact, BitTorrent was created in part to satisfy the needs of hippie concert tapers/traders. It didn't take long for BitTorrent to completely kill off use of FTP by Etree users.

  • Boo hoo hoo. (Score:4, Interesting)

    by girlintraining (1395911) on Saturday July 24, 2010 @12:22AM (#33011248)

    1986: Hey man, want a copy of this movie I got? Sure, I'll just pop it in my VCR and make a duplicate.

    2010: Hey man, want a copy of this movie I got? knock, knock Aw crap, it's the police! *thud* *smack* ow! ow! ow!

    RIAA -- Advocating social and technological progress since... ha ha, never you dopes!

    • by brit74 (831798)
      More like:
      1986: Hey man, can I get a copy of [name of movie]?
      98% of the time: no I don't have it. Guess you'll have to find some other way to get it.
      2% of the time: yeah, here's a degraded copy of that movie. Sorry about the poor quality.

      2010: Hey man, can I get a copy of [name of movie]?
      Which movie do you want? I've got high quality copies of 100,000 movies, many of them haven't been released to DVD - they're still in movie theaters, and some of them haven't even made it to movie theaters. Why
      • Re: (Score:3, Insightful)

        by internewt (640704)

        Why would anybody pay for anything?

        Indeed. Computers exist, and are getting faster, smaller, and cheaper as time goes on. High speed data lines into places of work, homes, and pockets exist.

        Do you expect people not to use this stuff, especially when downloading can be more convenient than obtaining it "legally"?

        The economic realities are that the tech is not going away, and humans are human. Trying to make snide comments about people pirating stuff is more of a waste of time than trying to stop the piracy!

  • Princeton Study (Score:5, Informative)

    by cappp (1822388) on Saturday July 24, 2010 @12:22AM (#33011252)
    In a similar Princeton study [freedom-to-tinker.com] the numbers were a little different but the general point remained the same.

    46% movies and shows (non-pornographic)
    14% games and software
    14% pornography
    10% music
    1% books and guides
    1% images
    14% could not classify

    They ultimately found approx. 1% to be legal.

    The Princeton piece makes for an interesting read because they do a good job of breaking down their categories and providing some detailed context. For instance, 53% of the porn was in English and 5% of the software was Spanish language. Just really rich data for anyone into this kind of analysis. The final paragraph on how they decided if content was illegal reads:

    Our final assessment involved determining whether or not each file seemed likely to be copyright-infringing. We classified a file as likely non-infringing if it appeared to be (1) in the public domain, (2) freely available through legitimate channels, or (3) user-generated content. These were judgment calls on our part, based on the contents of the files, together with some external research. By this definition, all of the 476 movies or TV shows in the sample were found to be likely infringing. We found seven of the 148 files in the games and software category to be likely non-infringing—including two Linux distributions, free plug-in packs for games, as well as free and beta software. In the pornography category, one of the 145 files claimed to be an amateur video, and we gave it the benefit of the doubt as likely non-infringing. All of the 98 music torrents were likely infringing. Two of the fifteen files in the books/guides category seemed to be likely non-infringing.

    • by Cyberllama (113628) on Saturday July 24, 2010 @12:55AM (#33011446)

      I don't think anybody will argue that Bittorrent is not a vector for piracy. It most certainly is. I think most will even go further and concede that it's primarily used for that purpose -- but these studies try to convince us that this is the *only* reason that Bittorrent exists and that is just plain silly. There are so many biases at play in this "research" that I almost don't know where to begin.

      I am not familiar with the prior Princeton study so much, but this more recent one is problematic in that they used a "random" selection of the "most actively seeded files". These are actually contradictory terms. Either the sample is random, or it's comprised of the most actively seeded files -- to say that it's a random sampling of a non-random subset is misleading at best.

      Anyone who's ever looked around on a tracker knows the real percentage is much higher. There's TONS of self-published material all over bit torrent particularly in the music and ebooks categories. While most of the ebooks might well be what most of us would consider "spam" ("Make $10,000 dollars in 7 days!"), they are almost certainly not copyrighted material in the sense that we would think of it. There may actually be some copyright asserted, but I doubt any of these have been properly submitted to the library of congress and their authors quite clearly intend for you to distribute them.

      Speaking of files you are intended to distribute, you also see quite a few game patches, service packs and other large files hosted on bittorrent. For instance, there's probably 100 torrents on the Pirate Bay right now that are just iPhone firmwares. While these may be technically still copyrighted material, they are *intended* for distribution. Simply being under copyright does not mean a file is not meant to be shared. In fact, some companies distribute their patches via bittorrent directly, such as Blizzard, but the trackers they use are almost certainly not included in this study. In fact, there are trackers that deal exclusively in legal-to-distribute content and they are clearly excluded from these sorts of studies. This further increases the bias in the results.

      Moreover there are the murkier issues of international laws. What is copyrighted in the United States can easily be public domain somewhere else. The internet does not know geographic boundaries, so establishing the legality of a file is almost never going to be a black or white issue.

      • I don't think anybody will argue that Bittorrent is not a vector for piracy.

        Aw, horseshit. That's exactly what Slashdot has been arguing for years - "it's a method of sharing files that just happens to also be used for illegitimate purposes by some" is how the argument has long been phrased. Just look at the highly rated comments in the discussion, like yours, each one arguing how it's just not possible that the study is accurate. It must be spin. Etc... Etc...

        While most of the ebooks might wel

        • by jonbryce (703250) on Saturday July 24, 2010 @04:48AM (#33012130) Homepage

          You could do a study of files hosted on Rapidshare and conclude that Internet Explorer is primarily used for piracy.

        • Re: (Score:3, Insightful)

          by bzipitidoo (647217)

          And copyright law is a couple of decades behind reality.

          Admit it, copyright doesn't work. Sharing can't be stopped. Criminalizing the behavior has served no good purpose, and certainly hasn't accomplished anything. The system by which authors are compensated needs radical reform. Stop beating up on everyone for "piracy". Why? Because sharing should be legal. And then methods for sharing wouldn't automatically be suspect.

          You write as if the authors and users of P2P programs were deliberately fomen

  • I find 100% of money spent on this study definitely wasted.
  • Wow! (Score:3, Funny)

    by L4wNd4rt (1863080) on Saturday July 24, 2010 @12:26AM (#33011276)
    You're kidding me, it's that high...wow!!
    • Re: (Score:3, Insightful)

      by Mashiki (184564)

      It would be higher if they were doing it from a country where TV/music sharing was legal.

  • by Triv (181010) on Saturday July 24, 2010 @12:29AM (#33011296) Journal

    "The total sample consisted of 1,000 torrent files--a random selection from the most active seeded files on the trackers they used."

    Most Active. Charming. It's almost like saying, "of the 1,000 most illegal torrents, almost 1,000 of them are illegal." I want to know about the millions of other files on BT, not the ones most likely to be illegal. Also: 1,000 randomly selected out of how many of the most active torrents?

    Bad study is bad, or at least bad press release is bad, and I can smell the spin from 5,000 miles away.
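
The sampling objection above can be illustrated with a toy simulation. This is only a sketch under invented assumptions: the population sizes, seeder counts, and the `legal_share` helper are hypothetical and say nothing about the real distribution of torrents.

```python
import random

# Hypothetical population of (is_legal, seeders) pairs; numbers are invented.
population = (
    [(False, 500 + i) for i in range(1_000)]       # 1,000 popular infringing torrents
    + [(True, 1 + i % 10) for i in range(9_000)]   # 9,000 legal torrents with 1-10 seeders
)

def legal_share(torrents):
    """Fraction of the given torrents that are marked legal."""
    return sum(1 for is_legal, _ in torrents if is_legal) / len(torrents)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Sampling frame 1: a random draw from the 1,000 most-seeded torrents.
top_1000 = sorted(population, key=lambda t: t[1], reverse=True)[:1_000]
sample_from_top = rng.sample(top_1000, 100)

# Sampling frame 2: a random draw from every torrent.
sample_from_all = rng.sample(population, 100)

share_all = legal_share(population)
share_top = legal_share(sample_from_top)
share_random = legal_share(sample_from_all)

print(f"legal share, whole population:       {share_all:.0%}")
print(f"legal share, sample of most-seeded:  {share_top:.0%}")
print(f"legal share, sample of all torrents: {share_random:.0%}")
```

Both samples are "random", but they answer different questions: the first describes popular torrents, the second describes torrents in general.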

    • by bbqsrc (1441981) on Saturday July 24, 2010 @12:33AM (#33011320) Homepage
      This [blogspot.com] article explains why the above poster is correct.
    • by cloricus (691063) on Saturday July 24, 2010 @12:45AM (#33011390)
      I applied their study methodology to sex in a status update.

      If you only look for sex statistics in brothels you'll only find prostitutes and from that information you can be sure that 99.7% of all human sex is paid for.

      As you can see it is sound and the results are rock solid!
    • How would you do the study differently? Tracking down an unbiased sample of all torrents would be a nightmare, and even then, a sensible study would weight based on activity. We are, after all, looking for data on how bittorrent is actually used, so their methodology isn't the worst way they could have gone about it.

      • by Trepidity (597)

        I agree, but only if we restrict the results of the study to talking about something like "popular bittorrent trackers" rather than "bittorrent" full stop. If you're talking about bittorrent as a technology, you really have to include all the various things distributed from 100%-legal-files-only trackers (which I doubt they included in their study), like this one [debian.org].

    • by T Murphy (1054674)
      Well to give them the benefit of the doubt, this isn't a bad study for estimating what percent of torrenting is legal. Even if most files happened to be legal, most of the traffic itself is sharing illegal files. To me that is a far more useful metric than what percent of files out there are legal.
    • by abigsmurf (919188)
      Most seeded torrents is the only fair way of doing it.

      If you randomly picked out of any and all torrents, they'd likely find that 95% of them have 0-5 seeds.
  • by kurokame (1764228) on Saturday July 24, 2010 @12:30AM (#33011300)

    infringing torrents :: ambiguous :: legal

    porn :: probably porn :: normal content

    spam :: probably spam :: real emails

    blog posts :: lazily disguised reposts :: real news

    fake google results :: crappy sites :: what you were actually searching for

    And so forth...within a small margin, this appears to be the standard ratio of the internet.

  • I think it's about time they legalize piracy.
  • !random (Score:2, Interesting)

    The summary states:

    The total sample consisted of 1,000 torrent files—a random selection from the most active seeded files on the trackers they used.

    Clearly then the sample isn't a random subset of 'all torrents' but instead of 'popular torrents on certain trackers.' This does not justify the proposition in the title "Study Finds 0.3% of BitTorrent Files Definitely Legal."

    That aside, fat chance I'm going to trust The Internet Commerce Security Laboratory to keep their science unbiased in this regard
    • by Americano (920576)

      Seriously, for whom would a sample size of 1,000 torrents seem even close to enough?

      Statisticians?

      You do not need a huge sample size to extract statistically significant results. Of course their sampling methodology could lead to skewed results due to a selection bias, and that's always worth considering... but a sample size of 1000 is not too small to derive statistically significant results from, assuming they used a reasonably "random" sampling methodology.
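
As a rough check on the sample-size point, here is a sketch that computes a 95% Wilson score interval for the reported 3 legal files out of 1,000. The choice of the Wilson interval and the `wilson_interval` helper are this example's own assumptions; nothing indicates the study computed anything like it.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# 3 "definitely legal" files out of 1,000 sampled, as reported in the summary.
low, high = wilson_interval(3, 1000)
print(f"95% interval for the legal share: {low:.2%} to {high:.2%}")
```

Under these assumptions the interval runs from roughly a tenth of a percent to a little under one percent, so the sample size alone is not the weak point. The interval only covers sampling error within the population that was actually sampled; it says nothing about selection bias in how that population was chosen.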

  • by Dwonis (52652) * on Saturday July 24, 2010 @12:44AM (#33011384)

    Okay, I used to use BitTorrent for downloading Linux and a bunch of other things, rather than downloading directly from mirrors. Do you know why I don't now? Because Bell Canada throttles BitTorrent traffic, but not plain HTTP and FTP traffic.

    Those bastards broke legitimate uses of BitTorrent, and now they complain that only pirates use it.

    • Bit Torrent for pirate movies and tv shows etc., is a complete waste of time anyway. There are FAR better ways to get that kind of stuff.

  • I don't think this study is about legal files; I think it's about files that are legal to distribute freely. There's a big difference; illegal files would be ones illegal to possess, period. Unless copyright law has changed, it covers distribution, not possession. Pedantic point, maybe, but illegal files really does refer to something, but not merely copyrighted works whose authors don't allow free distribution.
  • Selection bias (Score:4, Interesting)

    by munky99999 (781012) on Saturday July 24, 2010 @12:51AM (#33011426)
    0.3% chance this report isn't selection bias. Only 1000 torrents? Only 23 trackers? Why not 25? Were those extra 2 going to destroy your stats? How about 1 million torrents, taken from a specific date in time, over as many trackers as you can find. http://wiki.vuze.com/w/Legal_torrent_sites [vuze.com] Omg I did 250,000 torrents and only went to the above link for 29 trackers. New article: Study analyses 29 trackers, more than previously, finds 100% torrents legal.
    • by shish (588640)

      http://wiki.vuze.com/w/Legal_torrent_sites Omg I did 250,000 torrents and only went to the above link for 29 trackers. New article: Study analyses 29 trackers, more than previously, finds 100% torrents legal.

      Looking down that list, one tracker jumps out as being one of the largest old anime fansub trackers (the copyright holders don't live in the USA, but that doesn't make it legal...); and unless Disney have recently started giving things away for free, I'm pretty sure some of their stuff has been uploaded without permission too...

      Perhaps the reason the original study was so small was because they wanted to actually *do* the study, rather than just making shit up as evidence that their personal opinion is rig

  • a random selection from the most active seeded files on the trackers they used. Each file was manually checked to see whether it was being legally distributed.

    Note "from the most active seeded files"

    In other words, this doesn't really mean that only "0.3% of BitTorrent Files" are definitely legal.. far more might be legal but not among the top active torrents.

    That could mean there are plenty of legal torrents, but they don't make the list of top active ones, because (perhaps) illegal ones are mor

  • Unfortunately, I cannot find a link to the actual study, so it is impossible to tell if their methods really are accurate. However, I do believe that only 1000 of the most highly seeded files is not an accurate representation of all BitTorrent traffic. In fact that very requirement that they be the highest seeded sets up a bias within this study. A study of 100,000 randomly selected files from as many trackers as they can find would yield far more accurate results.
    Who knows? Maybe this is what they did ori
  • by chub_mackerel (911522) on Saturday July 24, 2010 @01:31AM (#33011566)

    "Copyrighted" refers to the work. "Infringing" refers to the *use* of the work. The first does not imply the second.

    The article says they checked "...whether the file was confirmed to be copyrighted..." And then apparently made the jump to assuming that anything copyrighted must be illegal, sliding immediately into calling them "infringing files."

    Of course by that metric all the Linux distros are illegal as well since they too are "copyrighted." As is any blog post, web page, or photo taken in the last, say, 70 years. As is anything that is shared properly according to the terms of any license. Now the study may have actually looked at the license terms in place for each work, but this is definitely not what the article *said*.

    Not to mention that regardless of any express license terms, sharing that qualifies as fair use is also NOT AN INFRINGEMENT and is LEGAL and should not be described as illegal or as "infringing files."

    Any indication whether these types of things (terms of the licenses according to each item, whether the sharing events qualified as fair use) were taken into account? If not, then I'd counter by noting that 100% of the material on Warner Bros' home page is copyrighted too. Should I say it's being shared "illegally"? Of course not, but my whole point is that if you play with semantics loosely enough, you'll find that probably the vast majority of the material on the Net as a whole is "illegal" and "copyrighted."

    *grumble*

    • by LoneHighway (1625681) on Saturday July 24, 2010 @01:47AM (#33011602)
      Agreed. There are a lot of audio books on torrent that are copyrighted, but out of print. What are you infringing if you can't buy the file at any price?
    • Alright -- to respond to myself --it does look like the researchers did some sort of manual license checking for each commonly-shared work, but the article is pretty silent on what, exactly, that entailed. I'm virtually certain it didn't involve checking for fair use possibilities.

      I'm curious as to how the same logic would have described the simple use of a VCR prior to the Sony case: "100% of material recorded on VCRs is copyrighted and definitely illegal." All copyrighted, yes, but much of the recording

    • Not to mention that regardless of any express license terms, sharing that qualifies as fair use is also NOT AN INFRINGEMENT and is LEGAL and should not be described as illegal or as "infringing files."

      For that very limited subset of files which will qualify as "fair use", sure. But outside of files licensed to be freely shared or in the public domain, that subset is so small as to be non-existent.

      my whole point is that if you play with semantics loosely enough, you'll find that probably the vast maj

  • by Animats (122034) on Saturday July 24, 2010 @02:08AM (#33011678) Homepage

    The trouble with the "peer to peer" systems today is that they're horrendously inefficient ways of transmitting the same data around. It's gotten better, but still, the same data passes back and forth across intercontinental undersea cables multiple times.

    Many years ago, when I was going to school in Cleveland, I stood on an overpass and watched two coal trains passing each other, in opposite directions. And I thought that some day, computers would be smart enough to get the owners of that coal in touch with each other so they could cut a deal and avoid the wasted transportation. And indeed, that happened.

    But now we have the same huge data files passing each other, in opposite directions. This is lame. Especially since USENET got it right. If the "peer to peer" systems weren't so focused on piracy, they could work much better.

    • Re: (Score:3, Insightful)

      by amentajo (1199437)

      Squid [squid-cache.org] sounds kind of like what you're trying to get at. It's a web proxy for HTTP/FTP. Frequently-requested pages are cached locally, so if an ISP runs it, then they can avoid querying out to the wider Internet and avoid all the extra hops associated with that.

      It could probably be extended (heck, maybe some ISP privately has, or done similar work thereof) to include the BitTorrent protocol: each torrent has a unique identifying hash, so it's theoretically possible for an ISP to monitor a swarm and cache e
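
For context on the "unique identifying hash" mentioned above: in the original (v1) BitTorrent protocol, a torrent is identified by the SHA-1 digest of the bencoded `info` dictionary from its .torrent file. The sketch below shows that computation with a minimal hand-written bencoder. The `bencode` and `info_hash` helpers and the sample `info` dictionary are illustrative assumptions, not taken from any real torrent or client.

```python
import hashlib

def bencode(value):
    """Minimal bencoder for ints, byte/text strings, lists and dicts."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def info_hash(info):
    """Hex SHA-1 of the bencoded info dictionary (BitTorrent v1 info-hash)."""
    return hashlib.sha1(bencode(info)).hexdigest()

# Hypothetical single-file torrent metadata, for illustration only.
example_info = {
    "name": "example.iso",
    "length": 1024,
    "piece length": 262144,
    "pieces": b"\x00" * 20,  # placeholder; real torrents store SHA-1 piece hashes
}

print(info_hash(example_info))
```

Because the hash depends only on the `info` dictionary, two peers holding the same torrent metadata derive the same identifier, which is what would let a cache recognise a swarm it has already seen.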

  • I can download Total Commander from author's site.

    I can download Total Commander (with added files, which do not modify original Total Commander files) from torrent sites as well.

    If I download it from torrent site, will this study consider it as a piracy?

    This study is flawed beyond comprehension.

  • Is it also true that only 0.3% of VHS tapes contain legal content when it was at its peak ?

    And I heard a lot of those old audio tapes (cassette recorders) had content that was just copied from other tapes (tape-to-tape they called it), people used to take them to concerts and release "bootleg" recordings.

    How the industry has continued to survive with such brazen disrespect for the laws surrounding the music(ians) they love is beyond me.

    Perhaps it would be appropriate to start an appeal so we can all donate money

  • Used to be, artists wanted to be heard. Nowadays, all these newfangled "artists" just want their pocketbooks expanded real easy like. Not sure who to blame, really. You might call 'em greedy, but you might say good vibrations don't fill an empty stomach, nor put a roof over your head. Might be a time they did. Not this time.

  • Cloud storage (Score:4, Interesting)

    by IndustrialComplex (975015) on Saturday July 24, 2010 @04:27AM (#33012060)

    I use bittorrent as a bit of a poor-man's cloud storage.

    I've got a ton of CDs I've purchased, and after a flood and a series of moves the HDs where I stored the ripped (low quality) MP3s were destroyed.

    So now whenever I want to listen to a CD that I've purchased, I just download the CD using bittorrent, usually as FLAC, and add the FLAC files to the library I'm rebuilding. I don't have to worry about setting up the ripping software, and I'm actually getting it a bit better organized this time.

    So for me, that 'illegal' content is just me rebuilding my digital copies of CDs or DVDs I legally own.

  • "it's clear that Linux distros weren't exactly dominating the charts here"

    I haven't really paid that much attention in the past, but just checking Debian now they have their own tracker, and I suppose many of the other Linux distros could be using trackers not on that list of 23. If all the major distros use their own trackers, then obviously most Linux bittorrent traffic wouldn't be on public trackers, and that statement is ludicrous.

  • ...There is a problem with the law, not with everyone. Laws were supposed to keep some social contracts working - like not running around killing everyone, paying taxes to support commons etc. When everyone is breaking the law - that means that the law does not reflect the current situation in a society. Either this - or you have a tyranny where the minority dictates to everyone what to do.
  • Looking at the study it immediately appears to be fundamentally flawed by the simple fact that the trackers analysed were in fact pirate trackers. What on earth did they expect? I'm actually quite surprised that there was even 0.3% legit content shared. If this test was to have been conducted properly they should have:

    Sampled traffic at ISP using DPI to look for torrent data
    Sampled from several ISPs
    Sampled in multiple geographic locations

    Not go to a well known warez tracker and click sort by most seeded and
