Research To "Reveal the Unseen World of Cookies"
An anonymous reader writes "The Guardian newspaper has teamed up with Mozilla to research the monitoring of online behavior through cookies and other web trackers. After downloading the Collusion add-on for Firefox, you can generate a visual representation of all the cookies that have been downloaded which are linked to the sites you have visited. This shows quite an interesting picture. The Guardian staff then want the data from Collusion to be uploaded to their site, after which they say 'we can build up a picture of this unseen world. When we've found the biggest players, we'll start tracking them back — finding out what data are they monitoring, and why.'"
Great Idea (Score:3, Interesting)
Internet marketing (Score:5, Interesting)
If average folks become aware of how many cookies get set (along with getting a user-friendly way* of turning them off), that could have a huge and entertaining effect on the world of Internet marketing**.
For example, right now I can assume that almost 100% of website visitors have JavaScript enabled (so it's not worth writing HTML for the case where they don't). But if I can only reasonably assume that, say, 50% of my visitors/email through-clickers/etc. have cookies active, that plays havoc with my reporting.
* "User-friendly" defined as "something my dad can do without asking me for help".
** I spend all day every workday in this world.
"What Data they are monitoring and why" (Score:4, Interesting)
It will be interesting to see not only the results of this analysis, but also how they reach whatever conclusions they draw.
Many cookies are used only to store a unique identifier. The data that many websites actually store about a user is housed and maintained on their servers, keyed by that unique ID. This could include "pages visited", "duration of visit", "browser/system specs/settings" along with any derived demographic data.
It would be hard (though not necessarily impossible) to determine this from a cookie analysis.
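To make that concrete, here is a minimal Python sketch of the pattern, not any real tracker's code. The cookie name "uid", the in-memory `profiles` dict, and the `handle_request` function are all made up for illustration. The point is that the cookie carries only an opaque ID, while the interesting data accumulates on the server.

```python
import uuid

# Hypothetical stand-in for a tracker's server-side database, keyed by visitor ID.
profiles = {}

def handle_request(cookies, page, user_agent):
    """Record a page view server-side and return the cookies to send back."""
    visitor_id = cookies.get("uid")
    if visitor_id is None:
        # First visit: mint an opaque identifier. This is all the cookie will hold.
        visitor_id = uuid.uuid4().hex
    profile = profiles.setdefault(
        visitor_id, {"pages_visited": [], "user_agent": user_agent}
    )
    profile["pages_visited"].append(page)
    return {"uid": visitor_id}

# Two requests from the same browser: the second one sends the cookie back.
cookies = handle_request({}, "/news", "ExampleBrowser/1.0")
cookies = handle_request(cookies, "/sport", "ExampleBrowser/1.0")
```

Anyone inspecting the browser sees only the "uid" value. The page history lives in `profiles`, which is why a cookie analysis alone says little about what is being recorded.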
Collusion is quite fascinating... (Score:4, Interesting)
Re:Pot kettle spy. (Score:5, Interesting)
Hi,
I'm the Guardian journalist working on this.
Unsurprisingly, if you install Collusion after reading an article on The Guardian, you tend to log cookies that our website sets. So we're noticing quite a few of the trackers we use on guardian.co.uk turn up in the project. :)
We're ok with that - better to be open that our website uses cookies for registration, analytics and advertising (just like most others!), than pretend or hide away the fact. Actually, we did another article on the same day showing how we use them: http://www.guardian.co.uk/technology/2012/apr/13/new-law-cookies-affect-internet-browsing.
The ones in that list above are a mix of third-party advertising cookies, analytics and A/B testing (so I'm learning!).
When it comes to the data we're going to try and get from the Collusion info - we can't really infer much about what behaviours have been tracked from the exported data. However, it gives us a nice long JSON string that associates certain cookies as being set when visiting certain sites. At the moment we're using that to find out how many instances of each type of tracker we're seeing across multiple sites.
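As a rough illustration of that counting step, here is a Python sketch. It assumes a simplified, made-up JSON shape (each visited site mapped to the third-party domains seen setting cookies there); the real Collusion export may well be structured differently, and the domain names are placeholders.

```python
import json
from collections import Counter

# Made-up export data: visited site -> third-party tracker domains seen on it.
# This shape is an assumption for illustration, not the actual Collusion format.
export = json.loads("""
{
  "news.example": ["ads.example", "stats.example"],
  "shop.example": ["ads.example"],
  "blog.example": ["ads.example", "stats.example", "abtest.example"]
}
""")

def count_trackers(site_to_trackers):
    """Count how many distinct sites each tracker domain appears on."""
    counts = Counter()
    for trackers in site_to_trackers.values():
        # set() so a tracker listed twice for one site is counted once.
        counts.update(set(trackers))
    return counts

tracker_counts = count_trackers(export)
most_prolific = tracker_counts.most_common(2)
# most_prolific == [("ads.example", 3), ("stats.example", 2)]
```

Counting per site rather than per cookie keeps one noisy site from inflating a tracker's total.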
We're then going to take the most prolific ones and find out more about what they do, who owns them, how they work, etc. However, we're going to be using old-fashioned journalism to do that - research and phone calls.
However, I was thinking of putting up open documents like this: https://docs.google.com/document/d/1lCp8H9i-MJwyORj_MOZflH6BCt9j6HIbQkyS2536knM/edit
so you could see where I'd got to and put me right if I was going off track (as it were). Good idea? Bad idea?
Joanna.