Major Flaws Found In Recent BitTorrent Study
Caledfwlch writes with a followup to news we discussed a couple days ago about a study that found only 0.3% of torrents to be legal. (A further 11% was described as "ambiguous.") TorrentFreak looked more deeply into the study and found a number of flaws, suggesting that the researchers' data may have been pulled from a bogus tracker. Quoting:
"Here's where the researchers make total fools out of themselves. In their answer to the question they refer to a table of the top 10 most seeded torrents. ... the most seeded file was uploaded nearly two years ago (The Incredible Hulk) and has a massive 1,112,628 seeders. The torrent in 10th place is not doing bad either with 277,043 seeds. All false data. We're not sure where these numbers originate from but the best seeded torrent at the moment only has 13,739 seeders; that's 1% of what the study reports. Also, the fact that the release is nearly two years old should have sounded some alarm bells. It appears that the researchers have pulled data from a bogus tracker, and it wouldn't be a big surprise if all the torrents in their top 10 are actually fake."
They also take a cursory look at isoHunt, finding that 1.5% of torrent files come from Jamendo alone, "a site that publishes only Creative Commons licensed music."
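As a rough check of the "1%" figure in the quote, the two seeder counts can be compared directly. This is only a sketch of the arithmetic: the two numbers come from the TorrentFreak excerpt, while the variable and function names are made up for illustration.

```python
# Seeder counts quoted in the TorrentFreak excerpt above.
reported_top_seeders = 1_112_628  # most-seeded torrent according to the study
observed_top_seeders = 13_739     # best-seeded torrent TorrentFreak could find


def percent_of(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# How large the observed figure is relative to the reported one.
ratio = percent_of(observed_top_seeders, reported_top_seeders)

# How many times larger the study's figure is than the observed one.
inflation = reported_top_seeders / observed_top_seeders

print(f"observed is {ratio:.2f}% of reported; reported is ~{inflation:.0f}x larger")
```

The result lands a little above 1%, so the rounded figure in the quote holds up.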
Honestly... (Score:5, Insightful)
Re:Honestly... (Score:5, Insightful)
It probably surprises the people that thought they could get away with presenting bogus data. ;)
Re: (Score:2, Insightful)
Probably not. They probably got paid whether the misinformation eventually gets called out or not. I'm sure they are quite happy with the mileage they got out of their "study."
Re: (Score:2)
Bittorrent isn't illegal, but does anyone honestly believe that at least 75% of all torrents or torrent traffic isn't accounted for by copyrighted material illegally being mass-distributed?
Re:Honestly... (Score:5, Insightful)
Problem is, most people who visit this site already know what this article is stating. They knew the study was bogus from the start because they are more in tune with torrents than the people doing the study. The issue arises when the "Recent Study" slamming torrents makes the 6:00 news and it makes a nice segway into how to combat piracy - however this article, showing that the data was incorrect and that they are either embellishing or straight up lying, will get no mention on mainstream media whatsoever. The people who need to see this news won't see it, and the people who see this news already know. More tragic than ironic.
Re: (Score:2, Insightful)
Segway - 2-wheeled self-balancing electric vehicle.
Segue - smooth transition to another topic.
Sheesh -- exclamation of disapproving disbelief, usually low-volume.
Re: (Score:2)
Another reason people love and hate English simultaneously: Homonyms.
Re: (Score:3, Funny)
That's God damn it, God damn it! :p
Re: (Score:2)
You're forgetting the pantheists.
Re: (Score:2)
Shades of Terry Pratchett! That would be "God(s(less)) damn it", remember the atheists out there. They're the most fanatical of all religions. Well, maybe not more fanatical than the Muslims, Scientologists, or FSMsters. Especially the FSMsters, don't let them catch you with a can of SpaghettiOs or they'll make the Muslims look like pacifists!
Re:Honestly... (Score:4, Insightful)
And you didn't catch segue?
Re:Honestly... (Score:4, Interesting)
No, because most tech people instinctually know that filesharing is ethically right, and the rest don't care for facts either way.
Re:Honestly... (Score:4, Insightful)
If the top 10 files were fake, they were not illegal. So by far most of the popular torrents are legal?
Re: (Score:3, Insightful)
No surprise here, Theoboley
In fact, for the last few years I've questioned that glaring abscess on the face of science, "the study".
( anon cow costume on for karma protection from the guilty, misled, clueless and those incapable of unbiased view due to vocation, religion or dementia)
Let's face it, a study is different than full blown research. Oddly enough though an article on a "study" will send the public off in dizzying new directions, convinced that physics has new rules, Bioscience has the cure for "fill
Re:Honestly... (Score:5, Informative)
The report gave the percentage of legal torrents as so low that some CC music site alone exceeds their entire sum of legal torrents on the entire internet. That doesn't mean that really only 98% of torrents are illegal, that means that their dataset is ludicrously inaccurate and the entire study is completely invalidated.
Who modded this interesting?
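The consistency argument in the comment above can be written out as a tiny check. It is a sketch under a loud assumption: the 0.3% figure comes from the study's own sample while the 1.5% Jamendo figure comes from TorrentFreak's look at isoHunt, so the two are not measured on the same set of torrents. The function name is hypothetical.

```python
# Percentages quoted in the story summary.
claimed_legal_share = 0.3     # study: share of all torrents that are legal
jamendo_share_isohunt = 1.5   # TorrentFreak: share of isoHunt torrents from Jamendo


def is_consistent(claimed_total: float, known_legal_subset: float) -> bool:
    """A known-legal subset can never be larger than the total legal share."""
    return known_legal_subset <= claimed_total


# Assumption: both shares are treated as if measured on comparable samples.
consistent = is_consistent(claimed_legal_share, jamendo_share_isohunt)
print("claims are consistent" if consistent else "claims contradict each other")
```

If the samples were comparable, one Creative Commons site alone would exceed the study's entire legal share, which is the contradiction the comment points out.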
Re: (Score:2)
We're not trying to argue that the majority of the torrents are legit...
Are you? My whole point was, who cares about this study (or about the far more ludicrous claim that 80% of torrents are legal by one of the Pirate Bay founders). The reality is that the vast majority of torrents are illegal. Are you disputing that?
Re: (Score:3)
That "pointless" pedantry may yet work in The Pirate Bay trial.
Re:Honestly... (Score:4, Informative)
For every torrent with 1,000 seeds, there are 10 fake ones with 10,000 seeds each. Since fakes don't contain copyrighted information, they are not illegal. So for every 1,000 illegal seeds, there are 100,000 legal ones. Therefore less than 1% of torrent seeds are illegal. ;)
Re: (Score:2)
Yeah, I'll dispute that.
I've never used bittorrent to download something illegal. I don't believe that I know of any occasion where someone has so used bittorrent.
I have no trouble believing that many people do use it illegally, but I really doubt that most use is for illegal activity. (Mind you, it's just doubt. But all of my personal knowledge is of legal activity. So I'll need actual verifiable evidence before I'll believe otherwise.)
Re: (Score:2)
Well, how do you measure use, by number of files or quantity of data (number of MiBs [wikipedia.org] or whatnot), or even number of torrents? Since most distros have torrents, measuring by data could make a huge difference (since distros are mostly 700 MB to fit on a CD, which is a lot huger than your average mp3).
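To see how much the choice of metric matters, here is a minimal sketch with a made-up catalogue: one 700 MB distro image and 99 mp3s of 5 MB each. Every file name and size below is a hypothetical assumption for illustration, not data from the study or from any tracker.

```python
# Hypothetical catalogue of (name, size in MB, is_distro).
# All entries are illustrative assumptions, not measured data.
catalogue = [("distro.iso", 700, True)] + [
    (f"track{i}.mp3", 5, False) for i in range(99)
]


def share_by_count(items) -> float:
    """Fraction of files that are distro images."""
    return sum(1 for _, _, is_distro in items if is_distro) / len(items)


def share_by_bytes(items) -> float:
    """Fraction of total size taken up by distro images."""
    total = sum(size for _, size, _ in items)
    return sum(size for _, size, is_distro in items if is_distro) / total


count_share = share_by_count(catalogue)
byte_share = share_by_bytes(catalogue)
print(f"by count: {count_share:.1%}, by bytes: {byte_share:.1%}")
```

With these invented numbers the single distro image is 1% of the files but more than half of the data, so a study has to state which measure it uses.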
Re: (Score:3, Interesting)
The thing that surprises me is that - given the facts which you already pointed out - someone would actually bother to fake the data. I always figured that 90%+ of torrents were illegal, so why would anyone conduct a fraudulent study and run the risk of being exposed, just so they could get a few extra percentage points? It makes me question my basic premise - maybe there ARE more legitimate torrents than I'm aware of.
Re: (Score:3, Insightful)
It's a difficult sort of study to do properly for a few reasons, so unless there is strong evidence otherwise (e.g. funding from big media) I'd expect this was simply a case of incompetence.
Reasons why it's a difficult sort of study
1: If you actually download the files to investigate them then you are getting into legally dodgy ground. If you want to download at more than a trickle you will have to upload too which puts you in an even worse position legally.
2: Afaict most legal torrents use their own tracker
Re: (Score:2)
This is exactly why we need an accurate study. No one here is claiming that 100% of torrent files are legal and no one would believe that either. The question is why when they say nearly 100% of torrents are illegal are you so ready to believe such a ridiculous claim?
With any type of study you need to q
Re: (Score:2)
I think you might have missed something there. Or, at least missed its significance.
"unconfirmed legal"
What does that mean, exactly? If I have money in my wallet, and a cop asks to see it, do I have to verify in some manner that it was "legally earned"? The cop documents it as "unconfirmed legal tender"?? I mean, WTF???
Torrents are either legal or they are illegal. Creating a new category "unconfirmed legal" assumes the need for some official to confirm that the torrent is legal, or it will fall into
Re: (Score:3, Insightful)
It means they supposedly couldn't figure out the copyright status of the torrents in question. TPB hosts legal and illegal stuff, so it might plausibly be hard to tell. You still need preponderance of the evidence to go after someone for copyright infringement.
in short... (Score:2)
...good we know who did (paid) the study.
Let's simply go seeding instead of this discussion.
Move along (Score:2, Insightful)
The best-seeded torrents... (Score:5, Interesting)
Re:The best-seeded torrents... (Score:5, Funny)
The patch that removes clothing completely from that game will bring the entire Internet to a standstill.
Re: (Score:2)
Like from blood elf paladins?
Please no.
Re: (Score:2)
I'm fearing naked Tauren and Draenei. But honestly, there are a great many humans that I would fear seeing naked, too. :p
Re: (Score:2)
So what will happen, when the patch comes out that allows characters to engage in X-rated activity?
Re: (Score:2)
That one's due on 12/21/2012.
Re: (Score:2)
If you mean standstill in the sense that nobody will be downloading it, then I agree.
Re: (Score:2)
Do you remember the "nudechick" skin in Quake II? I always considered that one a cheat!
Re: (Score:2)
http://games.softpedia.com/get/Mods-Addons/World-of-Warcraft-Nude-Patch.shtml [softpedia.com]
And the internets are still alive :)
Imagine that (Score:5, Insightful)
Industry group ending in 'AA' pays to have study conducted that supports their views, doesn't care so much about accuracy.
News at eleven.
Re:Imagine that (Score:5, Funny)
News at eleven.
I've got plans tonight and won't be home to catch the news at 11. Can someone upload a torrent for me?
Re: (Score:2)
Dammit, I was only planning on watching the 11 O'Clock News tonight!
(Also, if you've ever used a torrent, you should know that claiming that it has more than, say, 10,000 seeders is almost certainly BS.)
Moral Of The Story (Score:3, Funny)
Moral of the story.... don't trust seedy research.
Re:Moral Of The Story (Score:5, Funny)
Seedy research can plant misinformation.
Re: (Score:3, Funny)
But seedy research downloads more quickly!
Old content is interesting... (Score:5, Insightful)
One major problem with Bit Torrent is that you only get easy access to what is "popular" at any given time. I've gotten some TV show episodes (not available in the US) downloaded in a reasonable amount of time when I start the download within 24 hours of the original show being aired... but try to get the same episode 30 days later and availability drops in a hurry. Despite all the pro-P2P propaganda about how it "democratizes" data, it's really more a mob-rule popularity contest for grabbing the shiniest download.
Re:Old content is interesting... (Score:5, Insightful)
Get on a better site.
Re:Old content is interesting... (Score:5, Funny)
Re:Old content is interesting... (Score:5, Informative)
Two people's private machines sitting there serving only you unpopular content for free out of good will isn't enough for you? 2 seeders is plenty, especially with hard-to-find content.
Re: (Score:2)
Seriously! The kids these days! Sheesh...
Re: (Score:2)
I've been trying to get the last episode of Star Trek: Voyager for two months now. The station it played on changed networks right as the last season started, and I missed the entire last season. I can only find DVDs for the first two seasons at WalMart, and they're ridiculously priced. I only want that one damned show! I'd be happy as hell to find one seeder, let alone two. Hell, I'd give twenty bucks for that one episode if I could buy it, even though I just paid only five bucks for two Cheech and Chong m
Re: (Score:2)
Are you serious? There are over a hundred seeders on the complete series torrent alone, according to isohunt. There's even a series 7 only torrent for people too stupid to realize you can select which files you want.
Really? (Score:2)
Oh Shazbot!
Re: (Score:2)
Better yet take off and nuke the site from orbit. Only way to be sure.
What would you recommend? n/t (Score:2)
Re: (Score:2)
something invite only or limited signup, has seeding requirements, and/or is focused on the type of content you are likely to want. I'd post links to places I know, but the first rule of fight club is....
Re: (Score:2)
So, considering
there isn't anything out there right now that goes over 1% of the popularity of the Hulk in its prime? I somehow find that hard to believe.
Re:Old content is interesting... (Score:4, Insightful)
Re: (Score:2)
Depends on the content. If it's something popular, chances are people will keep sharing it. Or reshare it if the original torrents disappear.
I just downloaded a few TV shows that aired 10, 14 and 21 years back, myself.
Re: (Score:2, Interesting)
...really more a mob-rule popularity contest for grabbing the shiniest download.
Right. What they said. It democratizes data. The data with the most popular support has the most popular support.
That means that data people no longer care about gets lost to time. Of course, it only takes one person out there to keep that data alive. It may be slow, a little harder to find, and the connection to it may be less robust, but it's still there.
It also means if you get a community of people who don't want to see old TV and movies die, then everyone only has to host one or two shows and everyth
Re: (Score:2)
all of this is solved by finding a tracker with seeding/ratio requirements.
Re: (Score:3, Informative)
This is actually not true. I've often downloaded TV series and movies from the 60's, 70's and 80's, things I would never see on today's television channels but which bittorrent allows me to watch. Think of any TV show you liked as a child (or your father liked as a child), be it Star Trek (the original series), Little House on the Prairie or whatever - and you can watch it on bittorrent.
Re: (Score:2)
Really? I thought there were TV stations dedicated to older TV shows (TV Land [wikipedia.org]) and movies (Turner Classic Movies [wikipedia.org] and AMC [wikipedia.org])... assuming you live in North America that is.
Heck, you can watch Star Trek TOS episodes directly on CBS's site last I checked.
Re: (Score:2)
Assuming you live in the USA (or can use a proxy to pretend you are)
But, to the GP's point, I am part of one of those small interest groups that has a few gigs of HD being used to keep some old, relatively obscure British shows alive.
There's no way I (in North America) would have ever been able to see these without this technology.
And another bonus of seeding old obscure stuff, the **AA type orgs don't seem to put any effort into hassling people about it. It's essentially abandonware. (and that I'm
Re: (Score:2)
Check if you get free Usenet access with your ISP, or purchase a Usenet account from time to time to download some of the older stuff. Although, even with 600+ days of retention, content falls through gaps.
Keep on seeding/posting!
Re: (Score:3, Insightful)
Despite all the pro-P2P propaganda about how it "democratizes" data, it's really more a mob-rule popularity contest for grabbing the shiniest download.
Isn't mob rule exactly what democracy is all about? If there is little interest in a download then there will be fewer people seeding it. How else did you think democracy would work?
Green Thumb (Score:2)
The argument is merely about numbers. (Score:1)
Turns out it was actually 0.003%*. Sorry for the confusion.
*All the legal transfers were of Ubuntu ISOs.
any1 else smell the stench of MPAA on this? (Score:3, Insightful)
Yes, the original report was SOOOO flawed! (Score:1)
Torrents can be both legal and illegal at once (Score:5, Insightful)
Some countries' laws may flag a torrent as illegal while other countries consider it legal.
As an example, someone could be downloading a copyrighted song for backup purposes while owning a legitimate copy and these fools will automatically classify this kind of download as an infringement.
Re: (Score:2)
Which is why an exemption that specifically allows you the right to back up your own CDs and move them onto whatever devices you own can trump your argument. If law enforcement is such that it is not illegal for me to do day-to-day activities on my computer with media I've legally acquired, then shutting down public trackers that make almost exclusively copyrighted material available is a possibility.
I don't have a problem with this scenario. The exemptions provided to the DMCA today allow me to enjoy my medi
Re: (Score:3, Interesting)
This goes for games, too. I've used BitTorrent to download another copy of lots of games I paid for long ago.
Another factor to consider is that a pirate isn't necessarily concerned with backing up what they've downloaded. I know I've downloaded some games numerous times because I don't really worry about backups anymore if I don't have the original copy. Therefore, the number of times a torrent has been downloaded may give an inflated estimate of the number of pirates.
Re: (Score:3, Interesting)
Been doing the same with my old vinyl, as being a lot less trouble than acquiring the hardware to rip it myself (tho sometimes they're damned hard to find). The end result is the same -- I have MP3s of vinyl that I've already paid money for.
Re: (Score:2)
lol, my CDs were converted long ago, and I'm now 99% of the way through my DVDs. Down with shiny plastic disks!
Re: (Score:2)
Yeah, first thing I do with a newly-acquired CD is rip the blasted thing... easier to point at MP3s on the PC and let WinAmp have at, than to drag the CD off the shelf, muck about with the case and the drawer and THEN a player, and worry about scratching it along the way. And more versatile since it's way easier to mix-and-match to the taste of the moment. I mean, who wants to be swapping 50 CDs in and out to get the songs you want at the moment, when you can do it in a few clicks with MP3s and a playlist?
S
Re: (Score:2, Interesting)
I had the inverse of your problem - I've got the physical CD for Diablo, but the CD key is nowhere to be found. Was it in the original box or manual? Maybe it's at my parents' house (if it wasn't thrown out years ago)?
I used a CD key from a list I found on the net. I don't think it's fair to have to buy the game again because I lost a stupid piece of paper. DRM sucks :-(.
TorrentFreak? Really? Consider the source. (Score:1, Troll)
We are supposed to believe the analysis of a biased entity over professional researchers?
What, exactly, are TorrentFreak's and Ernesto's qualifications to analyze the data? Do they have education or degrees that include statistical and/or numerical analysis of data?
Or, did they read it and decide that it can't possibly be true because they don't like the results? Is it not in their best interest to promote the idea that the study is flawed?
Re: (Score:2)
The study wasn't a study at all. Torrentfreak at least knows more about BitTorrent than the people that made the report, if they were actually looking for a result that they hadn't already decided.
Re: (Score:2)
Torrentfreak didn't make or own BitTorrent. A closer analogy would be Gamespy commenting on a study of video games.
Re:TorrentFreak? Really? Consider the source. (Score:4, Informative)
Re: (Score:2)
His problem is not with the analysis, it is much more fundamental than that. They are using obviously bogus data, so whatever analysis they perform on it, their results are rubbish.
The car analogy equivalent: We researched current auto sales to figure out the 2010 market share for auto companies, and found out that Dodge is leading with 95% of the market. 99% of Dodge sales come from a single Dodge dealership whose owner, Mr. Al Coholic, informed us they sold 50 million Dodge cars from January 2010...
Re: (Score:2)
If you could comprehend the facts you would see that my analogy was perfect.
Perhaps you did not read the study?
Their data includes such gems as a 2-year old dvd release having over 1 million seeds! Obviously if I included data from one dealership that claimed to have sold many times the entire US sale volume just by themselves, even if I included all other "truthful" dealerships my results would be useless.
So, is there a research by Princeton that claims there are torrents with over 1 million SEEDS? I dare
Re: (Score:2)
You still don't understand what it means for a torrent to have over 1 million ACTUAL SEEDS???
The "2 year old" is just an extra, but it is proof you haven't read the report: IN THE SAME LIST there are torrents from 2010. Yeah, 2 year old data, that's the only problem...
Instead of calling other people close-minded zealots, just admit you have no clue what the f*ck we are talking about and climb back to your cave.
And for the record, I never said I believe there are more legal torrents etc. I.e. I have no idea
Re: (Score:2)
Hey.
The problem is that the raw data for the analysis shows torrents that have several orders of magnitude more seeders than I have seen on ANY torrents in the last 8 years. The highest I've seen is around 23,000 seeders.
Now, you probably won't just take my word for it, so let's look at (arguably) the most famous and most used tracker, Pirate Bay. On their top100 page [thepiratebay.org], the highest one has 17,000 seeders, and most of them just have a few thousand. And this is the most popular content on the most popular
Re:TorrentFreak? Really? Consider the source. (Score:4, Insightful)
We are supposed to believe the analysis of a biased entity over professional researchers?
When the professional researchers conclude that "Music, movies and TV shows constituted the three largest categories of shared materials, and among those, zero legal files were found", we have to conclude that they didn't do a very good job, because there are at least two sites (Jamendo and Etree [etree.org]) which allow nothing but legal music files, and both have tracked the exchange of many petabytes of data. (There are many more sites which limit themselves to legal material, but not to music--or TV or movies.)
If I were to do an analysis of FTP, and then deliberately limited my study to "pirate" sites, I would come up with a hopelessly biased sample and useless numbers. It may well be that the legal torrent sites are statistically insignificant, but if they didn't study them, how can they conclude that? Assuming that they are is basically assuming your conclusion. It begs the question.
I agree with your assessment of TorrentFreak, but a lack of credentials and credibility in a critic does not make a study legitimate.
Re: (Score:3, Interesting)
They checked sites which could contain infringing data. You suggest that they check sites where they are guaranteed not to find infringing data. Which data set is going to be more biased?
That would be fine if they framed it as follows: "Although numerous sites exist for the legitimate exchange of legal software and other data via torrents, sites which allow the option of both infringing and non-infringing data are much more likely to contain infringing data."
Here's how it is framed instead: "Only 0.3% of torrents are legal."
Re: (Score:3, Informative)
Blizzard uses torrents to distribute WOW patches.
Ubuntu, Eclipse, MySql, and more use torrents for the distribution of open source software.
For any widespread distribution of large files, bandwidth can become quite costly. Torrents are just about the best solution to reducing those costs.
The bad news... (Score:1, Insightful)
Guess which study the lobby groups (and consequently our politicians) are going to cite, and which one they will ignore?
It's too bad that there wasn't a way to attach this debunking to the original study, so that you would have to consciously ignore it. It will be really easy to lose these new findings in the shuffle.
Not news! (Score:2)
Response from the researchers (Score:3, Informative)
Ars Technica [arstechnica.com] has actually asked the researchers about the issue. Here is the response from Paul Watters, one of the researchers:
Thank you for your enquiry regarding our research report "Investigation into the extent of infringing content on BitTorrent networks". As researchers, we not only stand by the findings that we have arrived at, but - having made our methodology public - we are providing other bona fide researchers to replicate and/or dispute our findings. Their results can in turn be assessed through the peer review process; this is the process that normal research activity takes.
You have raised some interesting points that are fundamental to the validity of any study in this area: the sampling strategy; verification of results and so on. We believe that our methodology was rigorously applied to the sample that we obtained. Over time, we will replicate the sampling process, so that we will gain better estimates of the population results. This is the fundamental tenet of statistical sampling.
Re: (Score:2)
A response as soon as this is a good indicator, but I have a feeling, even though the Ars article notes other interesting points, the seemingly bad numbers will be focused on since a torrent site brought that point. Mr. Watters, even though your response makes sense, only 0.3% of Internet readers will realize you're not ignoring obvious fallacies. It's funny how much the speed of the Internet detriments the time put into considering the facts that make up the context.
So what? (Score:3, Interesting)
Re: (Score:2)
This is actually a good point. Mod Parent up.
Re: (Score:2)
Torrents are not, in and of themselves, currently illegal in the USA. This study will be used to support attempts to change that, or at least to try to induce ISPs to block torrents.