Nine Out of Ten of the Internet's Top Websites Are Leaking Your Data
merbs writes: The vast majority of websites you visit are sending your data to third-party sources, usually without your permission or knowledge. That's not exactly breaking news, but the sheer scale and ubiquity of that leakage might be. Tim Libert, a privacy researcher, has published new peer-reviewed research that sought to quantify all the "privacy compromising mechanisms" on the one million most popular websites worldwide. His conclusion? "Findings indicate that nearly 9 in 10 websites leak user data to parties of which the user is likely unaware."
Re: (Score:2)
I thought the point of https was that isps and others could not decrypt it?
Re: (Score:1)
You know... /. tracks the fuck out of you. Try SoylentNews.org (the Red site). No trackers. Period. The people are nice, informed, and decisions are made on a community level.
Re: (Score:3, Informative)
https://soylentnews.org/ [soylentnews.org]
Your privacy matters
Your community matters
No trackers. Period.
also note: https
Re: (Score:2)
My impression was that ISPs didn't look at much data because they could lose the safe harbor protections for copyright infringement and other criminal acts their customers might engage in, but with some types of usage monitoring the lines may be a little blurred.
Has this changed in some meaningful way?
Re: (Score:2)
you trigger all sorts of three-letter agency attention that you don't want, because it's now considered a sign of possible criminal activity if you actually have the gall to protect your privacy.
This. People are considered criminals and engaging in suspicious activity if they try to arrange their lives so people can't develop dossiers on them, attach derogatory information on whim and then share that dossier with just anyone.
That's insane.
Ask anyone from any dictatorship - and I have- especially read histor
It makes you more secure (Score:2)
... requires you reveal information. The laws of physics aren't going to change for anyone.
Not only that, but a good portion of the leakage makes you more secure and is better for the user. How many millions of sites have a Facebook login option? So Facebook can see your IP from that... because your browser is loading their javascript.
Would you really rather have a million copies of that javascript file out there that don't get updated when Facebook discovers a vulnerability or improves a security feature? Let's pretend you're not *you*, the tech guy running noscript, but a normal user.
Re: (Score:2)
I use uMatrix, Ghostery, and often have Disconnect enabled but I'm not sure why. Coupled with AdBlock, yeah, they can track me but it's pretty limited. Scripts only get loaded when I say they get loaded. I don't see ads. uMatrix is like NoScript but more easily refined. It's like an early version of Outpost Personal Firewall for your web browser. Between that, a VPN, and a remote connection to my home PC, I'm even comfortable using the hotel's wireless.
Re: (Score:1)
Doesn't matter... visit panopticlick on EFF.org's website, even with those tools, your browser will be found unique.
Who needs cookies when the Web browser hands over a fingerprint, and your ISP adds X-ACR headers to every HTTP transaction?
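To put a rough number on the fingerprinting point: tests like Panopticlick express uniqueness as bits of identifying information, where an attribute value shared by a fraction p of browsers contributes -log2(p) bits. Here is a minimal Python sketch of that arithmetic. The attribute names and shares are invented for illustration, and adding the bits together assumes the attributes are independent, which overstates the total for real browsers.

```python
import math


def surprisal_bits(share_of_browsers: float) -> float:
    """Bits of identifying information carried by an attribute value
    that is seen in the given share of all browsers."""
    if not 0 < share_of_browsers <= 1:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share_of_browsers)


# Hypothetical shares, made up for this example (not measured data).
ATTRIBUTE_SHARES = {
    "user_agent": 1 / 1024,
    "timezone": 1 / 16,
    "screen_size": 1 / 32,
    "installed_fonts": 1 / 4096,
}


def combined_bits(shares: dict) -> float:
    """Sum the per-attribute bits, assuming the attributes are independent."""
    return sum(surprisal_bits(share) for share in shares.values())


total_bits = combined_bits(ATTRIBUTE_SHARES)
# 2 ** total_bits is the size of the crowd this browser would be unique in.
crowd_size = 2 ** total_bits
print(f"{total_bits:.1f} bits, unique among roughly {crowd_size:,.0f} browsers")
```

With these made-up shares the total is 31 bits, which is why a handful of ordinary-looking attributes can single out one browser without any cookie at all.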
Re: (Score:2)
Actually, I don't know who it is but I'm no longer completely unique on panopticlick. I was, for quite a while, but that's changed. There were a whole two of us, I suspect the other one was also me. Either way, that's fine - or minimal. It's the site that gets that. I don't want it being spread to every other data collection service - which is why I use uMatrix.
wrong term (Score:2, Informative)
Not leaking it so much as shooting it out of a firehose.
Re: (Score:2)
Leaking would also indicate that it is unintentional. The websites are mainly leaking through ad networks, and are doing it on purpose to get money (or analytics).
Nine Out of Ten of the Internet's Top Websites... (Score:2)
... don't have my data.
Re: (Score:2)
... don't have my data.
And you think Slashdot doesn't share it for some reason? Don't give me this "they didn't say they would share" excuse...
If you do ANYTHING on the big "I" net, you are giving up information, like it or not... It's worse for you, you are posting on Slashdot for Pete's sake....
Re:Nine Out of Ten of the Internet's Top Websites. (Score:5, Informative)
And you think Slashdot doesn't share it for some reason?
Ghostery is blocking the following on Slashdot:
Doubleclick (advertising)
Google Adwords Conversion (advertising)
Google Analytics
Janrain
Scorecard Research Beacon
Taboola
It's on Slashdot, and everywhere else.
Here's a quote from TFA:
Most troubling is that if you use your browser setting to say 'Do Not Track' me, the explicitly stated policy of nearly all the companies is to flat-out ignore you
What we need is 9 out of 10 users to start explicitly blocking tracking and advertising, and then flat-out ignore the companies who complain about their bottom line. That article from the advertising industry group talking about how they screwed up rings a little hollow when they are obviously not interested in respecting the requests of consumers to not track them. Enabling Do Not Track is fine, but that only works with the good actors. For everyone else, see below.
https://www.ghostery.com/ [ghostery.com]
https://www.ublock.org/ [ublock.org]
https://adblockplus.org/ [adblockplus.org]
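For context on why Do Not Track only works with the good actors: the browser setting just adds a "DNT: 1" header to each request, and it is entirely up to the server to act on it. A minimal Python sketch of what honoring it could look like on the server side follows. The function names and script lists are hypothetical, not taken from any real site or library.

```python
def wants_no_tracking(headers: dict) -> bool:
    """Return True if the request carries a 'DNT: 1' header.

    HTTP header names are case-insensitive, so compare them lowercased.
    """
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() == "1"
    return False


def scripts_to_serve(headers: dict, first_party: list, trackers: list) -> list:
    """Hypothetical policy: leave out tracker scripts when DNT is set."""
    if wants_no_tracking(headers):
        return list(first_party)
    return list(first_party) + list(trackers)


# Made-up script names for illustration.
FIRST_PARTY = ["/js/site.js"]
TRACKERS = ["https://tracker.example/beacon.js"]

print(scripts_to_serve({"DNT": "1"}, FIRST_PARTY, TRACKERS))
print(scripts_to_serve({"User-Agent": "demo"}, FIRST_PARTY, TRACKERS))
```

Nothing forces a site to run a check like this, which is the point of the quote from TFA: the header is a request, not an enforcement mechanism, so blocking on the client side is the only part you control.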
Re: (Score:1)
Doesn't Adblock/Plus whitelist companies that pay them?
Re: (Score:2)
As far as I know they give you the option of seeing "trusted" ads (or whatever the terminology is), but last I knew they ask if you want to enable or disable that during setup. At this point I don't think they're turning it on without telling you, and they don't hide the option to turn it off.
Re: (Score:3)
What we need is 9 out of 10 users to start explicitly blocking tracking and advertising, and then flat-out ignore the companies who complain about their bottom line.
I'll tell you exactly what sort of response that would evoke from pretty much everyone, because I've already seen it: They start moving actual content and functionality for their sites to the same servers that are serving ads and things to track you, leaving you with two choices: accept their ads and tracking, or don't use their site at all. What's your response going to be when >90% of the Internet is denied to you, because you won't give in to their ads and tracking techniques? That's likely what's com
Re: (Score:3)
What's your response going to be when >90% of the Internet is denied to you, because you won't give in to their ads and tracking techniques? That's likely what's coming.
We'll have to find out what will happen when >90% of the internet sees large drops in their traffic. People in general are becoming more aware to ad-blockers, it's no longer relegated to niche Firefox extensions. That day is coming. I expect to see new revenue models, which may be a way to continue the tracking, e.g. you pay a monthly subscription to a single "content network" that provides access to thousands of sites if you're logged in, rather than paying sites individually. Obviously that parent
Re: (Score:2)
This. In addition, the ad networks like this because they can build a profile on you. I've never had an issue with a side bar or banner ad or whatever being served up from the same machine as the content I am reading.
Of course, if it gets too bad, since 99% of my web browsing is *reading* I can go back to a plain old text based browser like elinks
Re: (Score:2)
There's an interesting browser for Windows users called OffByOne. I've not used it in years but it wasn't too bad for text-only browsing. I think it displayed pictures as an option. Scripting simply doesn't work in it. At least it didn't years ago. Google indicates it is still around.
Re: (Score:2)
elinks is better, it can be compiled with both mouse and image support...
Re: (Score:3)
We'll have to find out what will happen when >90% of the internet sees large drops in their traffic. People in general are becoming more aware to ad-blockers, it's no longer relegated to niche Firefox extensions. That day is coming.
Pretty much this. I've installed it on a lot of regular folks' computers, usually after a demonstration of the difference in loading times enabled and disabled. I'm usually looking at them in the first place because of complaints of slow loading.
And I'm pretty certain it is having some effect already, as a number of sites that I no longer ever go to pop up screens that tell me to disable my ad blocker software......
Umm no folks, you'll never have even the chance to infect my machine ever again. ESAD bab
Re: (Score:2)
I expect to see new revenue models, which may be a way to continue the tracking, e.g. you pay a monthly subscription to a single "content network" that provides access to thousands of sites if you're logged in, rather than paying sites individually. Obviously that parent network would be able to track which of its sites you're on because you need to authenticate.
Hmm, seems like what the ISPs are doing right now. Your point is?
Re: (Score:2)
My point is that what I suggested is completely different from what ISPs are doing right now. When you host a site, does some random ISP pay you when their customer visits it? No? Then it's not the same thing, is it? What I suggested is more along the lines of what cable TV was supposed to be when it started, not ISPs that provide access to the internet in the first place. I don't expect an ISP to create what I'm talking about, that's not their job.
Re: (Score:2)
They start moving actual content and functionality for their sites to the same servers that are serving ads and things to track you, leaving you with two choices: accept their ads and tracking, or don't use their site at all.
I've been experiencing this already, not so much because a site is commingling its content and ads, but because my suite of advertisement/tracker/flash blockers breaks a small portion of the internet. Specifically, I've noticed:
* forbes: I can never click past their "quote of the day"
* politico: the drop down menu bar doesn't work
* lots of sites have comment boxes disabled
* occasionally I come across a video that won't load.
So, my response: some sites just fall off my radar like forbes, but I don't
Re: (Score:2)
I'll tell you exactly what sort of response that would evoke from pretty much everyone, because I've already seen it: They start moving actual content and functionality for their sites to the same servers that are serving ads and things to track you, leaving you with two choices: accept their ads and tracking, or don't use their site at all. What's your response going to be when >90% of the Internet is denied to you, because you won't give in to their ads and tracking techniques? That's likely what's coming.
Good. Then I'll use the ten percent of the sites that are left. Or not at all. Teh intertoobz are mighty damn sick these days, and are rapidly losing any semblance of usefulness. So if it reaches that point, then it will reach zero usefulness for many. Then business and the trackers will have won - sorta.
All I know is I already don't go to sites that demand I turn off my adblocker software.
Re: (Score:2)
What we need is 9 out of 10 users to start explicitly blocking tracking and advertising, and then flat-out ignore the companies who complain about their bottom line.
Yes, and this is part one of the strategy. Already, if I go to a site, and see "We see you are using an ad blocker. Please unblock to access our content."
NONONONONONO assholes! You can just go out of business for all I care. I just click back to where I was, and move on. If enough of them analyze how many people just say a collective "Eat shit mofo's!", that will be the first stage.
The second stage is to give them what they want. lots and lots and lots of data, all spoofed, all the time. Enough to make t
Re: (Score:3)
...which means it's failing to block ooyala.com, ntv.io, and rpxnow.com. You might want to get a better browser extension (such as RequestPolicy).
Re: (Score:1)
...which means it's failing to block ooyala.com, ntv.io, and rpxnow.com. You might want to get a better browser extension (such as RequestPolicy).
Privacy Badger from EFF catches them all.
Re:Nine Out of Ten of the Internet's Top Websites. (Score:4, Insightful)
Are you so sure of that? Are you actually taking steps to stop it? Are you verifying it?
Right now on Slashdot as I type this, there are 12 external domains being referenced, 8 of which want to run scripts. All of them are ad or analytics companies.
A massive amount of sites have references to the big ad sites (usually multiple), as well as references and/or cookies to social media sites ... which means a lot of ad companies trivially track you across sites, know where you visit, how often, and the pages you're reading.
Unless you are actively blocking this crap, and unless you're looking at the sites which are being blocked and adding the ones you've missed ... and clearing any cookies and shit they've added as you go ... you should really assume that these sites are seeing your data even if you don't subscribe to them or realize you're interacting with them.
You have to be fairly aggressively blocking this shit to believe those companies aren't seeing some of your data.
And, quite frankly, if you are aggressively blocking this shit, your friends and family are probably tired of you ranting about how fucked up the internet is. I know mine are.
The problem is so many people don't know this, and even if you try to tell them they don't care.
TLD's -- NOT SITES (Score:2)
Just skimmed the paper -- and it's talking about the "10 most common top-level domains" -- not websites.
Re: (Score:2)
That's not what the paper says, they aren't saying that .com is tracking you or leaking your data. It says that they ran their numbers on the entire pool of ~1 million sites, and then ran sub-analysis for the 10 most popular TLDs (plus edu and gov) - com, net, org, ru, de, uk, br, jp, pl, and in. Table 1 in the PDF shows those findings. For the entire data set, 9.47 external domains were contacted on average. Among those TLDs, Brazil had the highest with 11 domains on average, and gov the lowest with 3.
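For anyone wondering what "external domains contacted" means in practice, here is a minimal Python sketch that counts the third-party hosts a page's HTML would make the browser fetch from, then averages that over a set of pages. It is not the paper's tool: it only looks at static src attributes and <link href> tags (so it misses anything a script loads at runtime), and the same-site test is a naive suffix match, where a real tool would use the Public Suffix List. All hostnames in the sample are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalHostCollector(HTMLParser):
    """Collect hosts of resources that the page pulls from other sites."""

    def __init__(self, page_host: str):
        super().__init__()
        self.page_host = page_host.lower()
        self.external_hosts = set()

    def _same_site(self, host: str) -> bool:
        # Naive assumption: the page host and its subdomains are first-party.
        return host == self.page_host or host.endswith("." + self.page_host)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # src is fetched automatically; href only counts on <link>,
            # since an <a href> is not requested until the user clicks it.
            fetched = name == "src" or (name == "href" and tag == "link")
            if not fetched or not value:
                continue
            host = (urlparse(value).hostname or "").lower()
            if host and not self._same_site(host):
                self.external_hosts.add(host)


def external_hosts(html: str, page_host: str) -> set:
    collector = ExternalHostCollector(page_host)
    collector.feed(html)
    return collector.external_hosts


def average_external_hosts(pages: dict) -> float:
    """pages maps a page host to its HTML; returns mean external hosts per page."""
    counts = [len(external_hosts(html, host)) for host, html in pages.items()]
    return sum(counts) / len(counts)


SAMPLE_HTML = """
<script src="https://ads.example.net/loader.js"></script>
<img src="//stats.example.org/pixel.gif" width="1" height="1">
<script src="/js/local.js"></script>
<link rel="stylesheet" href="https://static.news.example.com/site.css">
<a href="https://elsewhere.example/">an ordinary link</a>
"""

print(sorted(external_hosts(SAMPLE_HTML, "news.example.com")))
```

On the sample, only the ad loader and the 1x1 pixel count as external; the relative script, the first-party stylesheet and the plain link do not. Run over a large list of sites, the average of those counts is the kind of figure the 9.47 refers to.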
Gomer says surprise 3 times (Score:2)
It was clearly not a long-contemplated ethical conundrum for the bigger share of them.
Surprising news! (Score:3)
One out of ten of the Internet's top web sites doesn't leak your information!
Re: (Score:3)
Given that I'm not a social networking whore, I'm less worried since I likely am not using sites to start with.
That doesn't matter. If you go to 10 sites and all of them tell your browser to contact Facebook for some Javascript API, then Facebook knows that your browser visited those 10 sites. If you then identify yourself on any of those sites, like logging in to Amazon or Newegg or whatever, then now they know who you are (or at least who is using that browser) and can match that up with their database to know which sites you've visited and what else you've done online. You don't need a Facebook account for tha
Re: (Score:2)
That doesn't matter. If you go to 10 sites and all of them tell your browser to contact Facebook for some Javascript API, then Facebook knows that your browser visited those 10 sites.
That's what script disabling is for. Certainly the number of tracker scripts FB has is impressive. And the site better be damn good for me to turn them off. Something like Sophia Vergara taking a shower maybe. Other than that, if I can't see their ads or content, it's their loss - not mine.
Re: (Score:2)
I want to know which one of the 10 is it?
Actually, what the researcher says is that 9 out of 10 websites leak information about who visits them to third parties, but if you think about it, ANY site that accepts banner ads does this... So if you are surprised by this revelation, I feel sorry for you..
Re: (Score:3)
I want to know which one of the 10 is it?
It's roughly 10% of the top 950,000 sites.
Re: (Score:2)
I'm guessing Wikipedia.
Re: (Score:2)
extending this... I haven't seen this mentioned on the thread to-date. Some browsers have features to help protect your privacy. Safari and Firefox have a setting to block cookies from third-party sites. So if you visit amazon.com and login, the site can put a login cookie on your computer, but you won't get third-party trackers from omg.zzoba321.gov.co.ru.in.
I'm not going to name names, but some browsers notably omit this function, possibly because the browser's developer makes all its money from tracking peo
Wait! Wait! I have a solution! (Score:2)
Re: (Score:2)
Some ARE kicking down the door.... But we usually call that malware and viruses..
Personally, I hand out "personal information" for a person who is totally fiction beyond the name to any website that requires I give up information to them and that I still want to use. There are exceptions, of course, but I only share what is required and stick to the identity I invented as much as possible.
Re: (Score:2)
If you think that the data they are collecting is predominantly a result of things being typed into a form... you have no business acting so self-righteous. Instead, you should step back and re-think what privacy is, and how it pertains to the Internet.
Re: (Score:2)
If you think that the data they are collecting is predominantly a result of things being typed into a form... you have no business acting so self-righteous. Instead, you should step back and re-think what privacy is, and how it pertains to the Internet.
Erm, privacy is fucking pulling down the curtains to cock-block anybody getting information that I do not want them to have?
I'm not being self righteous at all - I'm being a master of my own fucking information. Please master yours, or stop bitching about your loss of privacy. If you want to sell your info to get an app, have a nut, but don't bitch if you sold out yourself for a new shiny app.
Re: (Score:2)
OK, if you're an internet engineer then you understand that the vast majority of internet data mining is tracking footsteps and breadcrumbs as people travel around the internet and interact with sites, then doing various cross correlations and linking to find insights into what products these people may buy, then showing them ads for those products? And you understand that info about you is leaked by your browser, your computer, your IP address, and your ISP?
So even if a user never types in a single thing, lots of
Re: (Score:2)
So even if a user never types in a single thing, lots of info is logged, enough to
Oh God, the internet demons got him. Fuck. They're probably anally probing him as we speak in Guantanamo. Bastards! Life isn't fair.
So, as I said, don't put your personal shit out on the web, or you'll get butt-probed at Guantanamo. Just like Noah.
Re: (Score:2)
The point is that, short of logging off entirely and becoming a luddite hermit, it's incredibly hard to actually accomplish that! I have literally six different anti-tracking browser extensions going (BetterPrivacy, Lightbeam, RefControl, RequestPolicy, Self-Destructing Cookies and uBlock), and whitelisting cross-site requests extremely judiciously, and I still doubt I'm stopping all the tracking!
Re: (Score:2)
Don't have to give out personal information...
For example go to TireRack (don't log in), look for tires for your car, then come to Slashdot (don't log in), and the first ad you see is an ad for the tires you just looked at on TireRack. Gee either someone is looking at cookies that they shouldn't be or both sites use the same analytics engine and that engine is tracking you across the sites.
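The second explanation is the usual one, and the mechanics are simple enough to sketch. When two sites embed a script or pixel from the same third party, the browser sends that third party its cookie plus a Referer naming the page being read, so the third party's own request log is already a browsing history. Below is a minimal Python sketch with a made-up log; the cookie IDs and hostnames are hypothetical, and real ad networks do far more than this.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_profiles(request_log):
    """Group the sites seen in Referer headers by the tracker's cookie ID.

    request_log is an iterable of (cookie_id, referer_url) pairs, i.e. what
    a third-party server could record for each request it receives.
    """
    profiles = defaultdict(list)
    for cookie_id, referer in request_log:
        site = urlparse(referer).hostname
        if not site:
            continue  # no usable Referer on this request
        if site not in profiles[cookie_id]:
            profiles[cookie_id].append(site)
    return dict(profiles)


# Invented log entries: nobody logged in anywhere, yet the visits line up.
LOG = [
    ("cookie-42", "https://tires.example/search?size=225-45-17"),
    ("cookie-42", "https://news.example/story/privacy"),
    ("cookie-77", "https://news.example/"),
    ("cookie-42", "https://tires.example/cart"),
]

for cookie_id, sites in build_profiles(LOG).items():
    print(cookie_id, "->", sites)
```

Here cookie-42 is tied to both the tire shop and the news site, which is all it takes to show tire ads on the second one. Blocking the third-party request, or its cookie, breaks the link.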
But think of the (Score:2)
The heat saved, the cooling not needed as the intensive new encryption was not turned up.
The cash saved in not having expert staff add new encryption that only modern browsers could really use.
All that tracking adds to deeper understanding of the consumers and earns a profit.
All a browser can do is load up on the more useful add ons to try and block most of the more direct site based tracking.
Vice is terrible (Score:2)
Re: (Score:2)
Facebook is the worst offender. Google is number 2. Maybe number 3 depending on where you place Amazon
Re: (Score:3)
From TFA:
"The worst perpetrator is Google, which tracks people on nearly 80 percent of sites, and does not respect DNT signals,"
From the paper:
While there are a number of companies tracking users online, the overall landscape is highly consolidated, with the top corporation, Google, tracking users on nearly eight of ten sites in the Alexa top one million.
and:
That said, half of the top ten images belong to Google, including the most requested image, the Google Analytics tracking pixel. This image is found on 46.02% of sites, is only 1x1 pixels large, and is utilized solely for tracking purposes.
and:
The most striking finding of this study is that 78.07% of websites in the Alexa top million initiate third-party HTTP requests to a Google-owned domain. While the competitiveness of Google is well known in search, mobile phones, and display advertising, its reach in the web tracking arena is unparalleled. The next company, Facebook, is found on a still significant 32.42% of sites, followed by Akamai (which hosts Facebook and other companies' content) on 23.31% of sites, Twitter with 17.89%, comScore with 11.98%, Amazon with 11.72%, and AppNexus with 11.7%.
There's also this little nugget:
More specifically, internal NSA documents leaked to the Post by former NSA contractor Edward Snowden revealed that a Google cookie named "PREF" was being used to track targets online. Additional documents provided to The Guardian by Snowden detailed that another Google cookie (DoubleClick's "id"), was also used by the NSA; in this case to attempt to compromise the privacy of those using anonymity-focused Tor network [19].
Re: (Score:2)
That's the most prevalent. That does not necessarily mean the worst. It especially does not mean the worst when you say that "the worst is on most of the sites", implying that there is some other dimension to consider.
It's hard for me to imagine anyone thinking Facebook's collection of data isn't creepier than Google's.
Re: (Score:2)
Oh, you're talking about the unquantifiable "creepy factor" when you use the term "worst". I assumed you were referring to the data set in TFA.
Re: (Score:2)
Well, that's because GP talked about how Google was most prevalent and also worst. It's redundant if you use TFA's data set to define worst as more prevalent.
And, frankly, Facebook is creepy because it's so much more prevalent than Google. Websites tracking me is just a much lower concern than turning all my friends into unpaid informants on my actions.
Re: (Score:2)
I just saw your username. It's kind of ironic.
Re: (Score:3)
they consider as much as a Google tracking cookie to be "leaking your data"
Well it is, so they're right. Shit man, it's right there in the name. It's not the Google Friendly Cookie, it's not the Google Helpful Cookie, it's not the goddamned Google Blowjob Cookie. It's tracking you. It's the very definition of leaking your data. Maybe what you're confused about is the definition of "your data". Hint: "your data" includes where you go online.
Howsabout Slashdot? (Score:3)
Especially with your mobile site with three rows of full-page-height (at 1920x1200 even) ads and a script popping an ad at the bottom that's almost comically impossible to retract?
All reported on (Score:5, Insightful)
Re:All reported on (Score:4, Interesting)
Ghostery blocked the following on motherboard.vice.com:
Alexa Metrics
ChartBeat
Facebook Connect
Google Ajax Search API
Google Analytics
Google+ Platform
Krux Digital
Netratings Sitecensus
Pinterest
Quantcast
Sailthru Horizon
Scorecard Research Beacon
Twitter Button
Re: (Score:2)
When you load the Facebook Connect code from a third-party site, that is Facebook tracking you. Facebook knows that your browser requested their code from the other site. They can tell which site it is. They know that you visited that site. And, if you happen to have a valid login session at Facebook or did recently, then the site you visited probably also knows your identity on Facebook. All of this data gathering is part of the ability to post to Facebook. That is how Facebook makes its money, it se
Re: (Score:1)
Here's proof:
https://www.facebook.com/moron [facebook.com]
What if you don't use "top" websites? (Score:2)
Re: (Score:1)
Damn it, I thought I was logged in when I posted that. If you reply to the "does anti-tracknig [sic] software protect against this?" topic, please reply to this so I get a notification.
Re: (Score:2)
http://yro.slashdot.org/story/... [slashdot.org]
The Fingerprinting wiki https://wiki.mozilla.org/Finge... [mozilla.org] has some of the more unique methods to track users.
Soon tracking and ads will just be part of the site as functionality. Try and remove ads, tracking and the page, site is reduced to a title. No text, video, comments unless all tracking blockers are removed. Hard work for creators per pa
Re: (Score:2)
Ghostery does help some, but I highly doubt it will ever near 100% in terms of stopping tracking. As posted elsewhere in this thread, Privacy Badger would be another extension to look into. I don't see a problem with running multiple extensions. Adblock plus is fine for just stopping ads, and obviously Noscript is the heavy-handed way to stop a lot of this stuff also.
Slashbot? (Score:1)
One story [here] just answered another on [slashdot.org]
Crap article is basically an advertisement (Score:2)
This is a crap article and just pushing for the tool the guy built.
All the tool tells you is that the site makes 3rd party requests (Ghostery does a lot better job at this than some random bundle of python scripts). It does not tell what any of those 3rd party requests are doing, nor whether any personal data is being "leaked" by the site itself. Nor does it tell you if the site is pushing data wholesale on the backend to 3rd parties.
it's your browser (Score:2)
The website isn't the leak. It just politely asks your browser to leak, and the browser naively complies. FWIW, people are sort of finally on this (e.g. PrivacyBadger) though we're still in the very early days of people-giving-a-fuck.
One website, one domain. (Score:2)
Pretty soon instead of blackholing domains I don't trust, I'm going to have to start whitelisting the few that I do trust. Nice job corporate assholes, you ruined the internet.
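The difference between the two approaches is just which way the default points: a blackhole list lets everything through except named hosts, while a whitelist blocks everything except named hosts. A minimal Python sketch of the whitelist side follows. The domains are made up, and the subdomain rule (trusting a domain also trusts its subdomains) is an assumption of this example rather than how any particular extension works.

```python
from urllib.parse import urlparse


def is_allowed(host: str, allowlist: set) -> bool:
    """True if host is a trusted domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in allowlist)


def filter_requests(urls, allowlist):
    """Split request URLs into (allowed, blocked) under a default-deny policy."""
    allowed, blocked = [], []
    for url in urls:
        host = urlparse(url).hostname or ""
        (allowed if is_allowed(host, allowlist) else blocked).append(url)
    return allowed, blocked


# Hypothetical trusted domains and page requests.
TRUSTED = {"news.example", "cdn.news.example"}
REQUESTS = [
    "https://news.example/story",
    "https://img.cdn.news.example/logo.png",
    "https://ads.example.net/loader.js",
    "https://notnews.example/track.gif",
]

allowed, blocked = filter_requests(REQUESTS, TRUSTED)
print("allowed:", allowed)
print("blocked:", blocked)
```

Note the match is on a dot boundary, so a lookalike such as notnews.example does not slip through just because its name ends with a trusted string.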
Is there a plugin? (Score:2)
Comment removed (Score:3)
Re: (Score:2)
If everybody did this there would be no value in your data. Sour the milk.
You're confusing data quality and data marketability. While your proposal would diminish data quality, data quality is already pretty low as far as I can tell based on the supposedly "targeted" ads I see. But despite the fact that it's already unreliable at best, the companies collecting the data are still able to monetize it quite thoroughly, and will continue to do so no matter how bad the data gets. The companies (and governments) buying the data just want an excuse to do more of what they're doing. They d
For instance... (Score:2)
For instance Slashdot: (originally posted as AC)
jadserve.postrelease.com
cdn.taboola.com
The following domains don't appear to be tracking you
www.googleadservices.com
cdn-social.janrain.com
cdn.quilt.janrain.com
player.ooyala.com
widget-cdn.rpxnow.com
slashcdn.com
s.ntv.io
So does the USPS! (Score:1)
Hey - if the NSA can do it... (Score:1)