Research To "Reveal the Unseen World of Cookies"
An anonymous reader writes "The Guardian newspaper has teamed up with Mozilla to research the monitoring of online behavior through cookies and other web trackers. After downloading the Collusion add-on for Firefox, you can generate a visual representation of all the cookies that have been downloaded which are linked to the sites you have visited. This shows quite an interesting picture. The Guardian staff then want the data from Collusion to be uploaded to their site, after which they say 'we can build up a picture of this unseen world. When we've found the biggest players, we'll start tracking them back — finding out what data are they monitoring, and why.'"
Great Idea (Score:3, Interesting)
Re:Great Idea (Score:5, Funny)
Is there an equivalent of Collusion for Chrome?
I believe it's called Google Ads ;-)
Re:Great Idea (Score:4, Insightful)
You mean those ads which are displayed on all browsers and are in no way tied or targeted to Chrome?
Either you're a troll or that was a bad joke.
It's interesting to me how someone joking around is considered a 'troll' to you. You are what is wrong with /. these days. 'Troll' is the new 'I disagree with you'.
Did he REALLY evoke an emotional response from you by saying what he said? Did it truly upset you to the point where you were incensed and bitter over his words? If so, then maybe he is, indeed, a troll. Otherwise, shut up.
Re: (Score:2)
It's interesting to me how someone joking around is considered a 'troll' to you.
That's flat out dishonest of you. I said either a troll or a bad joke. In other words, I found the joke so bad I couldn't be sure if they were really trying to make a joke or actually trolling, or both.
You are what is wrong with /. these days. 'Troll' is the new 'I disagree with you'.
Wrong. I'm not saying they might be trolling because I disagree. I said it because the entire premise of the joke is factually without basis. There
Re: (Score:1)
*snip*
You ramble on and on with your inane and worthless counter arguments. Trolls have had the same definition online for AGES now, and you've gone out and created your own definition of what a troll is for the context of this story and discussion. If you do not understand those basic things then, by all means, shut up as I have previously requested of you
You also didn't merely "seem" serious. You obviously WERE serious. So trying to lie your way out of your stupidity is just making you look worse. Also, you
Re: (Score:3)
I'm not saying they might be trolling because I disagree. I said it because the entire premise of the joke is factually without basis.
Hate to break the news to you, but jokes don't have to be factually accurate or even vaguely plausible.
You seem to have issues with people criticising Google in a humorous way. I suppose at least you're not an Apple fanboy, but a Google fanboy isn't much better.
Get over it, they're computer companies not our Lord Jesus Christ.
Re: (Score:2)
When we've found the biggest players, we'll start tracking them back — finding out what data are they monitoring, and why.
I can answer this entire thing in 2 seconds. Porn, so they can sell it to you. In that order.
Re: (Score:2, Insightful)
Who goes on the internet to BUY porn?!
Re: (Score:2)
Who goes on the internet to BUY porn?!
Well a fair amount of people obviously do or you wouldn't get so much advertising, would you? Advertisers don't do it for the fun of it.
Re: (Score:1)
When we've found the biggest players, we'll start tracking them back — finding out what data are they monitoring, and why.
And then we'll sell the info back to them!
Re: (Score:1)
Is there an equivalent of Collusion for Chrome?
Yes
How to get rid of them (Score:5, Informative)
On Firefox, disable HTML5/DOM storage, install CookieMonster 1.5 and BetterPrivacy.
Re: (Score:2)
On Google Chrome, the first thing to do is to disallow third-party cookies:
Settings -> Under the Hood -> Content Settings -> Block third-party cookies and site data
Re: (Score:2)
Protect the whole network.
I was thinking of making a squid filter that replaces cookies to known adentity sites with any variable data changed to random data of the same length and composition.
In my opinion, if a server wants to inform other sites about my visit there, fine, do so, but then they need to contact the sites, not trick me into doing it for them.
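A rough sketch of the scrambling step described above, in Python. It is not a Squid plugin and makes no assumptions about Squid's interfaces; the function names are made up for illustration, and it scrambles every value in the header rather than only the "variable" parts, since telling those apart would need per-site knowledge.

```python
import random
import string

# Character classes that are replaced in kind; anything else is kept as-is.
CHAR_CLASSES = (string.digits, string.ascii_lowercase, string.ascii_uppercase)


def scramble_value(value, rng=random):
    """Return a random string with the same length and composition as value."""
    out = []
    for ch in value:
        for chars in CHAR_CLASSES:
            if ch in chars:
                out.append(rng.choice(chars))
                break
        else:
            # Punctuation and other characters are left untouched.
            out.append(ch)
    return "".join(out)


def scramble_cookie_header(header, rng=random):
    """Scramble the values of a 'name=value; name2=value2' Cookie header.

    Cookie names are preserved; only the values are randomized.
    """
    pairs = []
    for pair in header.split("; "):
        name, sep, value = pair.partition("=")
        pairs.append(name + sep + scramble_value(value, rng))
    return "; ".join(pairs)
```

Calling `scramble_cookie_header("uid=Ab12-cd; lang=en")` gives back a header of the same length with the same names, but with each letter and digit swapped for a random one of the same kind.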
Pot kettle spy. (Score:5, Insightful)
we'll start tracking them back — finding out what data are they monitoring, and why.
Well, here's my contribution;
The Guardian page in the link has six trackers:
24/7 Real Media
Audience Science
ForeSee
Maxymiser
Optimizely
Quantcast
I don't know what any of them do, and I blocked them all. Fuck 'em.
Re: (Score:3)
24/7 Real Media
Audience Science
ForeSee
Google Adsense
Maxymiser
Omniture
Optimizely
Quantcast
Twitter Button
Re:Pot kettle spy. (Score:5, Funny)
Story of my life. I brag about having 6, and the other guy has 9.
Re: (Score:2)
Go out and buy a couple cases of boosters and then you won't have to deal with the guy who brags about how many decks he has in his backpack.
Re: (Score:3)
You missed some more!
googleapis
simplifydigital
guim
llnwd
ophan
ytimg
youtube
quantserve
wunderloop
revsci
cogmatch
imrworldwide
I'll leave it as an exercise for the reader to de-dupe the above list (e.g. quantserve vs. quantcast and ytimg vs. youtube) and decide for themselves which ones are innocuous.
I didn't even bother to let any of them run any javascript to discover what else they might try to sneak in. I'm also willing to bet I missed something.
You have to love the "obfuscation" and attempts to get p
Re: (Score:2)
That's not an attempt to get past blocking. It's a necessity to get the HTML parser.
Re: (Score:2)
To get past the HTML parser.
Re: (Score:2)
Nope.
document.write("<script type='text/javascript'>");
works just fine. You're thinking of the closing tag:
document.write("</scr"+"ipt>");
is a necessity to get past the HTML parser.
Re:Pot kettle spy. (Score:5, Interesting)
Hi,
I'm the Guardian journalist working on this.
Unsurprisingly, if you install Collusion after reading an article on The Guardian, you tend to log cookies that our website sets. So we're noticing quite a few of the trackers we use on guardian.co.uk turn up in the project. :)
We're ok with that - better to be open that our website uses cookies for registration, analytics and advertising (just like most others!), than pretend or hide away the fact. Actually, we did another article on the same day showing how we use them: http://www.guardian.co.uk/technology/2012/apr/13/new-law-cookies-affect-internet-browsing.
The ones in that list above are a mix of third-party advertising cookies, analytics and A/B testing (so I'm learning!).
When it comes to the data we're going to try and get from the Collusion info - we can't really infer much about what behaviours have been tracked from the exported data. However, it gives us a nice long JSON string that associates certain cookies as being set when visiting certain sites. At the moment we're using that to find out how many instances of each type of tracker we're seeing across multiple sites.
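As a sketch of the counting step described above, in Python: the real layout of the Collusion export is not shown here, so the structure below (a JSON object mapping each visited site to a list of tracker domains seen on it) is purely an assumption for illustration, as are the function name and the sample domains.

```python
import json
from collections import Counter


def count_trackers(export_json):
    """Count, for each tracker, how many distinct sites it appeared on.

    Assumes a hypothetical export shape:
        {"visited-site": ["tracker-a", "tracker-b"], ...}
    """
    data = json.loads(export_json)
    counts = Counter()
    for site, trackers in data.items():
        # set() so a tracker listed twice for one site is counted once.
        for tracker in set(trackers):
            counts[tracker] += 1
    return counts


sample = json.dumps({
    "news.example": ["quantserve.example", "revsci.example"],
    "shop.example": ["quantserve.example"],
})
tracker_counts = count_trackers(sample)
```

`tracker_counts.most_common()` then lists the most widespread trackers first, which is the shortlist one would take to the research-and-phone-calls stage.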
We're then going to take the most prolific ones and find out more about what they do, who owns them, how they work, etc. However, we're going to be using old-fashioned journalism to do that - research and phone calls.
However, I was thinking of putting up open documents like this: https://docs.google.com/document/d/1lCp8H9i-MJwyORj_MOZflH6BCt9j6HIbQkyS2536knM/edit
so you could see where I'd got to and put me right if I was going off track (as it were). Good idea? Bad idea?
Joanna.
Re: (Score:1)
Urgh. Sorry. Forgot to log in. o_O
Re: (Score:2)
Hi Joanna,
A typo from the linked-to page:
Should be:
Good luck with the project, it's an interesting one.
Re: (Score:1)
It's not called the Grauniad for nothing.
Methodology Issues (Score:3)
You know already who the "Big Players" are - Google, Facebook, Microsoft, your choice of a couple more related ones.
Then it descends into all these little companies. I would expect that some of them are subsidiaries of the big guys etc.
The ideal goal of each of these "thingies" (cookies, flash objects, etc etc) is to narrow down who visits to a unique user if possible.
So just copy the Ghostery block list, maybe the AdBlock block list, your choice of a couple more tools.
If you want a "market share per ad
Re: (Score:2)
So just copy the Ghostery block list, maybe the AdBlock block list, your choice of a couple more tools.
Guardian does seem to be re-inventing the wheel a bit. Ghostery (Evidon/Better-Advertising/Direct-Advertising-Assoc) already has not just a public list of tracking companies, but a page of info about each one.
Whereas Collusion seems more about displaying the connections ("collusion") between known trackers that you personally encounter, not collecting new info for a data dump.
I like the Guardian, and I appreciate the journo sticking her head in the lion's den, but it seems to me she and they would achieve mo
Re: (Score:2)
However, I was thinking of putting up open documents like this: docs.google.com/blah so you could see where I'd got to and put me right if I was going off track (as it were). Good idea? Bad idea?
Putting this stuff on google is like asking the NSA to host wikileaks ... bad idea.
Re:Website (Score:2)
Sure, why can't you host your notes at something like http://www.guardian.co.uk/JGeary/CookieStudy.html [guardian.co.uk]?
Then just keep uploading new iterations of the page.
And I figured out part of what was bothering me. You're asking for "data for research" but your initial article is "shadowed" - it reads like "give us data and we'll figure out what we want to write about".
Write two versions of your story: the Mass Market one "Look, it's 2012, we found all these cookies! They're evil!" and the other with a FAR More rigor
Re: (Score:2)
You seem to have access to a website you could already publish it on, no?
Failing that, for whatever reason, you could put it in a wiki on branchable [branchable.com]. No, I'm not affiliated with them in any way, but they were the first "good" answer that jumped to my mind.
More obscure but perhaps extra appropriate for the topic at hand, you could publish it on a "hidden service" on tor?
Re: (Score:3)
Eh? If Ms Geary puts it anywhere public online, google can see it anyway. (As can the actual NSA.) So unless you're saying that Google will censor her work, your comment makes no sense.
Re: (Score:2)
How did you block them?
I was thinking about adding null redirects to 127.0.0.1 in the /etc/hosts file, but perhaps there is a better way?
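For what it's worth, the hosts-file approach can be sketched in a few lines of Python. The hostnames below are placeholders, not a real block list, and the function name is made up. Note that a hosts file has no wildcards, so every hostname a tracker uses has to be listed individually.

```python
def hosts_entries(domains, sink="127.0.0.1"):
    """Build hosts-file lines that point each hostname at the sink address.

    Hostnames are lowercased and de-duplicated, then sorted so the
    output is stable from run to run.
    """
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    return [f"{sink} {d}" for d in unique]


# Placeholder hostnames for illustration only.
for line in hosts_entries(["ads.example", "ADS.example", "track.example"]):
    print(line)
```

The printed lines would be appended to /etc/hosts by hand (with root privileges); the sketch deliberately does not touch the file itself.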
Re: (Score:2)
Firefox + Ghostery
+ABP +NoScript +WOT +no-third-party-cookies...
I didn't think I was especially paranoid (I have a google account, don't use on-disk or in-mail encryption, etc) until I realised that this isn't how most people think.
Cookieculler (Score:5, Informative)
I have never found anything that matches cookieculler for features: it doesn't just purely delete cookies, it operates with a white-list based system (the way everything on the web should work). Cookieculler deletes all cookies each time you close the browser, except the ones you have whitelist "protected", that keep login information etc. as you choose.
Along with noscript, cookieculler is the main reason I stay on firefox.
Re: (Score:3)
Citation really needed.
Re: (Score:2)
How is cookieculler different from setting a default policy in Firefox and then using the built-in whitelist in Firefox to give permissions for certain sites?
Re:Cookieculler (Score:5, Informative)
Re: (Score:2)
I like going one step beyond CCleaner. I use sandboxie on my browsing sessions. This provides four benefits:
1: My Web browsing is redirected to another volume. This means that cookies and other stuff are not stored on my main application or data drives, but are separated. This keeps potential malware as separated as one can get from the system without resorting to actual VMs.
2: When I close the Web browser, all stored stuff is gone, guaranteed. There is no worry about hidden cookies, LSOs, or any ot
I don't see the irony in it (Score:1)
Protect yourself from tracking websites with this addon that collects all your cookies and sends them to us!
Re: (Score:2)
That was my reaction too.
Combined with the technology being used, not installing it was a given.
Cookies or COOKIES!?!? (Score:2, Informative)
Anyone else read the title and thought people were taking a deeper look at why those delicious baked goods are so tantalizing?
Bah ... (Score:2)
I read the title, and get all excited ... and then read the summary to find they're not talking about the Girl Scouts, Nabisco, or other things that might involve sugar and chocolate chips.
And now that I got my hopes up, I'm going to go see what's in the vending machine. There's usually animal crackers, at the very least.
Re: (Score:2)
It depends on if they track the cookies from the Girl Scouts website.
Internet marketing (Score:5, Interesting)
If average folks become aware of how many cookies get set (along with getting a user-friendly way* of turning them off), that could have a huge and entertaining effect on the world of Internet marketing**.
For example, right now I can assume that almost 100% of website visitors have JavaScript enabled (so it's not worth writing HTML for the case where they don't). But if I can only reasonably assume that, say, 50% of my visitors/email through-clickers/etc. have cookies active, that plays havoc with my reporting.
* "User-friendly" defined as "something my dad can do without asking me for help".
** I spend all day every workday in this world.
Facebook (Score:5, Informative)
Re: (Score:3)
Spoiler: It's practically every site.
Re: (Score:2)
I've been doing that for a while now as it's much simpler and, once you've gone through the initial setup
Re: (Score:2)
You'd be shocked at how many cookies come from facebook across multiple sites. I use an extension called Ghostery (https://addons.mozilla.org/en-US/firefox/addon/ghostery/) to block most of them.
I use Ghostery plus RequestPolicy [requestpolicy.com] which gives you control over every single external request that a web page makes. It is like a noscript for cross-site references of any kind.
Re: (Score:1)
+1 for RequestPolicy, although I have to say when I restart my browser, then immediately find myself staring at a FUBAR page, I usually just hit "temporarily allow all requests" and get on with life, tracked as I may be. I do log out of facebook each time and delete facebook.com cookies, but I suspect that facebook still tracks me on other domains they control. I am like a tiny tiny person shaking a tiny tiny fist at the giant.
Re: (Score:2)
I use AdBlock Plus to nix the Facebook tracking. At the cost of seeing "Like" buttons everywhere I go (yes, that's a joke), these filters or some similar will do the trick:
||facebook.com^$third-party,domain=~facebook.net|~fbcdn.com|~fbcdn.net
||facebook.net^$third-party,domain=~facebook.com|~fbcdn.com|~fbcdn.net
||fbcdn.com^$third-party,domain=~facebook.com|~facebook.net|~fbcdn.net
||fbcdn.net^$third-party,domain=~facebook.com|~facebook.net|~fbcdn.com
You will occasionally see a button when the image is hosted
Yo Dawg (Score:3, Funny)
"What Data they are monitoring and why" (Score:4, Interesting)
It will be interesting to see not only the results of this analysis, but also how they came to any conclusions that they reach.
Many cookies are used only to store a unique identifier. The data that many websites actually store about a user is housed and maintained on their server, keyed by the unique id. This could include "pages visited", "duration of visit", "browser/system specs/settings" along with any derived demographic data.
It would be hard (though not necessarily impossible) to determine this from a cookie analysis.
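To make the point concrete, here is a toy Python model of that pattern. The class, method, and field names are invented for illustration and do not describe any particular tracker. The cookie carries nothing but an opaque id, and everything interesting lives in a server-side table keyed by it.

```python
import uuid


class TrackerServer:
    """Toy model: the cookie holds only an id; the profile stays server-side."""

    def __init__(self):
        # cookie id -> profile data that the browser never sees
        self.profiles = {}

    def handle_request(self, cookie_id, page):
        """Record a page view and return the id to set as the cookie value."""
        if cookie_id is None:
            # First visit: mint an opaque random identifier.
            cookie_id = uuid.uuid4().hex
        profile = self.profiles.setdefault(cookie_id, {"pages_visited": []})
        profile["pages_visited"].append(page)
        return cookie_id


server = TrackerServer()
uid = server.handle_request(None, "/news")   # no cookie yet
same = server.handle_request(uid, "/sport")  # browser sends the id back
```

Someone inspecting the cookie would see only a 32-character hex string, which is why analysing cookies alone says little about what has been recorded.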
Collusion is quite fascinating... (Score:4, Interesting)
Research To "Reveal the Unseen World of Cookies"? (Score:1)
No research needed, the truth [wikipedia.org] about the unseen world of cookies has been known since 1968. They're made in a hollow tree by elves [keebler.com].
Cookie scrambler (Score:2)
porn (Score:3)
finding out what data are they monitoring, and why
Well, all the porn websites seem to know that I prefer brunettes over blonds.
Cookies not the only way to do this... (Score:5, Informative)
Cookies are not the only evidence of tracking, even counting Flash LSOs, HTML5 local storage, etc.
There's a surprising amount of identifying information in request headers and what's available to javascript. (see http://panopticlick.eff.org/ [eff.org] for a demonstration.) That means one often needn't accept or store a cookie to be tracked.
A really comprehensive pro-privacy browser extension would munge request headers and the enumeration of fonts, plugins, screen resolutions, etc. to match one of, say, the top 5 most common desktop browser fingerprints, and would change it every so often. (Changing per request would itself be a trivially detectable signature.)
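A small Python sketch of the idea, under loud assumptions: the attribute names, the hashing scheme, and the "common profile" are all hypothetical, and real fingerprinting uses many more signals than this. It only illustrates that overwriting the distinguishing attributes with a shared profile makes two different browsers look identical.

```python
import hashlib

# Hypothetical set of attributes a tracker might combine.
FINGERPRINT_KEYS = ("user_agent", "accept_language", "screen", "fonts", "plugins")


def fingerprint(attrs):
    """Hash the selected attributes into a single identifier."""
    parts = []
    for key in FINGERPRINT_KEYS:
        value = attrs.get(key, "")
        if isinstance(value, (list, tuple)):
            # Sort so the order of enumeration does not matter here.
            value = ",".join(sorted(value))
        parts.append(f"{key}={value}")
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def munge(attrs, common_profile):
    """Return a copy of attrs with the common profile's values substituted."""
    munged = dict(attrs)
    munged.update(common_profile)
    return munged


# Made-up values for illustration only.
COMMON = {
    "user_agent": "CommonBrowser/1.0",
    "accept_language": "en-US",
    "screen": "1366x768",
    "fonts": ["Arial", "Times New Roman"],
    "plugins": ["Flash"],
}
alice = {"user_agent": "RareBrowser/0.3", "screen": "1920x1200", "fonts": ["Comic Sans"]}
bob = {"user_agent": "OtherBrowser/9", "screen": "1024x768", "plugins": ["Java"]}
```

Here `fingerprint(alice)` and `fingerprint(bob)` differ, while `fingerprint(munge(alice, COMMON))` and `fingerprint(munge(bob, COMMON))` are equal.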
-Isaac
C is for cookie, it's good enough for me (Score:1)
Gooble gooble gooble. ..Huh, what's that? Wrong type of cookies? Oh....
I love Cookie Monster. He taught me the best places to hide my cookies as a kid.
Ghostery started tipping me off to how much stuff I was missing. I'm in the process of whitelisting sites, which is a pain with all the underlying stuff lying around.
How about the unseen world of javascript? (Score:2)
Re: (Score:2)
That's the one people should be the most concerned with. When I first started using NoScript, I was stunned at how many supposedly reputable sites were using javascript pulled from ten or twenty different unrelated sites. There's just NO good excuse for that at all.
Agreed - quite amazing. And how insidious Facebook is...
Slashdot uses DoubleClick, Google Analytics and (Score:1)
Sad (Score:2)
It's not compatible with 3.6, which I prefer over the UI of later versions.
Wonder how many data points that will lose them.
Chocolate chip spectroscopy, anyone? (Score:1)
>< n/t