Data Miners Scraping Away Our Privacy
Presto Vivace writes "Twig, writing for Corrente, reports on data scrapers. They are not looking for passwords and such; scrapers are looking at blogs and forums, searching for material relevant to their corporate clients. We are assured that the information is 'anonymized' to protect the identities of forum participants. However, a tool called PeekYou permits users to connect online names with real-world identities. No worries, though: if you have a week to spare, you can opt out of some of the larger data banks."
Whew... (Score:3, Informative)
If the only thing I have to fear is PeekYou, then I'm utterly anonymous. [peekyou.com]
I have no doubt... (Score:2, Informative)
Re:Opt-IN should ALWAYS be the default (Score:3, Informative)
Except that robots.txt is not enforceable in any way. Spiders can ignore your robots.txt, and I've even seen some that actually spider what's in robots.txt looking for the "juicy" stuff.
One solution that an associate came up with was to put a URL in robots.txt that is not linked from anywhere on the normal site, so the only way to find it is to read robots.txt. The URL, when accessed, would run a program that instantly blocked the client IP address in the server's firewall. After implementing this, he very quickly accumulated thousands of entries in the firewall table.
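For anyone who wants to try the trap idea, here is a rough sketch of the logic in Python. It is not the associate's actual program, which I haven't seen. The trap path, the TrapBlocklist class, and the in-memory set standing in for the firewall table are all made up for illustration; a real setup would call out to whatever firewall the server runs.

```python
import ipaddress

# Hypothetical trap path: listed in robots.txt, linked from nowhere else.
TRAP_PATH = "/trap-3f9a/"

# robots.txt that asks all crawlers to stay out of the trap path.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PATH}\n"


class TrapBlocklist:
    """Blocks any client that requests the trap path.

    The set of blocked addresses is an in-memory stand-in for a real
    firewall table (an assumption made to keep the sketch self-contained).
    """

    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.blocked = set()

    def handle(self, client_ip, path):
        """Return the HTTP status code this request would receive."""
        ip = ipaddress.ip_address(client_ip)  # raises ValueError on bad input
        if ip in self.blocked:
            return 403  # already blocked
        if path.startswith(self.trap_path):
            # Only something that read robots.txt and ignored it gets here.
            self.blocked.add(ip)
            return 403
        return 200


if __name__ == "__main__":
    bl = TrapBlocklist()
    for ip, path in [
        ("203.0.113.5", "/index.html"),
        ("203.0.113.5", TRAP_PATH),
        ("203.0.113.5", "/index.html"),
    ]:
        print(ip, path, bl.handle(ip, path))
```

Well-behaved crawlers never touch the trap because they honor the Disallow line, so anything that lands in the blocklist has either ignored robots.txt or mined it for targets. One caveat: blocking by IP can also catch other users behind a shared proxy or NAT.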