Privacy Your Rights Online

Data Miners Scraping Away Our Privacy 142

Presto Vivace writes "Twig, writing for Corrente, reports on data scrapers. They are not looking for passwords and such; scrapers are looking at blogs and forums searching for material relevant to their corporate clients. We are assured that the information is 'anonymized' to protect the identities of forum participants. However, a tool called PeekYou permits users to connect online names with real world identities. No worries, though — if you have a week to spare, you can opt-out of some of the larger data banks."
This discussion has been archived. No new comments can be posted.

  • Whew... (Score:3, Informative)

    by hrimhari ( 1241292 ) on Friday October 15, 2010 @11:11AM (#33908376) Journal

    If the only thing I have to fear is PeekYou, then I'm utterly anonymous. [peekyou.com]

  • I have no doubt... (Score:2, Informative)

    by sudden.zero ( 981475 ) <sudden.zeroNO@SPAMgmail.com> on Friday October 15, 2010 @11:18AM (#33908450)
    that this is a real problem, as I have personally had trouble with data scrapers scraping my data. However, the tool they are talking about (PeekYou) couldn't find a stripe in a pack of Fruit Stripe gum. I looked up several of my handles and several of my friends' handles and was not able to find anyone. Then I looked up real names and was still unsuccessful. So don't worry about PeekYou; worry about people doing actual data scraping the old-fashioned way.
  • by panda ( 10044 ) on Friday October 15, 2010 @12:27PM (#33909320) Homepage Journal

    Except that robots.txt is not enforceable in any way. Spiders can ignore your robots.txt, and I've even seen some that actually spider what's in robots.txt looking for the "juicy" stuff.

    One solution that an associate came up with was to list a URL in robots.txt that was not linked from anywhere on the normal site, so only a client that read robots.txt and ignored it would ever request it. The URL, when accessed, would run a program that instantly blocked the client IP address in the server's firewall. After implementing this, he very quickly accumulated thousands of entries in the firewall table.
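
    A rough sketch of how such a trap could work, not the associate's actual program. The trap path, the class name, and the use of iptables are all assumptions made for illustration; the comment only says the client IP was blocked "in the server's firewall". The sketch keeps the block list in memory and merely prints the firewall commands it would issue, rather than running them.

```python
import ipaddress

# Hypothetical trap path: listed in robots.txt, linked from nowhere else.
TRAP_PATH = "/private-archive/"

# robots.txt content that advertises the trap only to clients that read it.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PATH}\n"


class TrapBlocklist:
    """Blocks any client that requests the disallowed trap path."""

    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.blocked = set()

    def handle(self, client_ip, path):
        """Return the HTTP status this request would receive."""
        ip = ipaddress.ip_address(client_ip)  # raises ValueError on bad input
        if ip in self.blocked:
            return 403
        if path.startswith(self.trap_path):
            # A well-behaved crawler never gets here; block this one.
            self.blocked.add(ip)
            return 403
        return 200

    def firewall_rules(self):
        """Render the commands an iptables setup might use (assumption);
        nothing is executed here."""
        ordered = sorted(self.blocked, key=lambda ip: (ip.version, int(ip)))
        return [f"iptables -A INPUT -s {ip} -j DROP" for ip in ordered]


if __name__ == "__main__":
    blocklist = TrapBlocklist()
    blocklist.handle("203.0.113.7", "/index.html")        # normal visit
    blocklist.handle("203.0.113.7", TRAP_PATH + "notes")  # springs the trap
    for rule in blocklist.firewall_rules():
        print(rule)
```

    In a real deployment the rule would have to be applied by something with firewall privileges, and the block list would need an expiry policy, since thousands of permanent entries (as described above) can eventually catch legitimate users on reassigned addresses.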
