Google Businesses The Internet Your Rights Online

Google to Anonymize Users' Search Data 151

Google's official blog says the company is working to anonymize its search data after 18-24 months. After previously fighting turning over search data to the feds, it looks like they are striking another blow to the "think of the children" crowd. Any bets on whether MSN or Yahoo! will follow suit?

  • The real WTF is.. (Score:2, Interesting)

    by b100dian ( 771163 )
    ..the "off the record" button, in the first place!
    • Re: (Score:3, Insightful)

      by jacquesm ( 154384 )
      I never got why google needs to keep all that history without anonymizing it.

      There is - as far as I can see - no rational argument that has to do with improving search results because you have them tied to individuals.

      And yes, keeping tabs on half the globe is evil too...

      • Re: (Score:2, Interesting)

        Not only that, but is the history of searches you made over 2 years ago relevant to your current searches performed today?
        • by Dunbal ( 464142 ) on Thursday March 15, 2007 @08:00AM (#18360623)
          Not only that, but is the history of searches you made over 2 years ago relevant to your current searches performed today?

          Studies have shown that 43% of all people who search for "Donkey Love" will buy our product within 3 years if they see our ads.
        • The only way to know for sure is to keep records of people's searches for 2 years :)
        • "Not only that, but is the history of searches you made over 2 years ago relevant to your current searches performed today"

          It is to Google as they want to know more about you, so they can build up a clearer profile about you. Just because they (say they) are going to delete the data after 2 years, doesn't mean they will not use the data in that two years to build up a profile about what you like. Then they can still keep updating that profile over time while deleting data. So even once they delete the dat
      • by dammy ( 131759 )
        Let's see here, they are worried about turning data over to the US government but they have no qualms about getting on their knees to the Communist Chinese government? Am I ever glad I no longer spend my company's money on AdWords.

        Dammy
      • Individuals? You mean "BigFatMamma2002" or "BigBirdDork18m"?
      • There's something called abuse. There are scripts that use Google to search for terms like the name of an exploit-ridden gallery system in order to get a list of vulnerable pages, and there are people who want to modify Google Trends results. There are actually a lot of reasons to abuse web search, so I would have made a logger as well...
      • ADVERTIZING (Score:3, Insightful)

        by everphilski ( 877346 )
        It's all about the advertising. Google's knowledge of you lets them advertise to you more effectively.
      • This information is very valuable to an ad provider. Just do a little data mining [wikipedia.org], and you will find things like "people who searched for pregnancy clothes 5 years ago are more likely to click on a children's clothes ad today" and many other not-so-obvious relationships.

        The only reason google is willing to throw this information away (and money with it) is because customers are concerned about their privacy.

        • Re: (Score:2, Insightful)

          People searching for their social security numbers just for the hell of it, or their CC numbers, and presto! Now real numbers exist in some "Google history list" for ever and ever.

          There's a goldmine of data there. "Anonymizing" it doesn't affect this, unless they have filters to try to recognize such and get rid of it.

          Still, if it's in the form of "User X" searched for these 132 terms last month, some terms might identify them and hence link them to other things like their unfortunate search for "donkey l
      • They keep it for purposes such as personalized search results and the like.

        A.I. is also far more effective at linguistic analysis (which Google may wish to introduce in the future, if they haven't already) when relations between results are known, and can be mapped to one user.

        The type of things a single user would search for are often limited to certain categories of knowledge and thus a linguistic analysis engine could determine query relations which would improve search results for future users.
      • Not if it helps to catch pedophiles.

      • No, keeping tabs on half (or even all) of the globe is NOT evil. If you don't want anyone to keep tabs on you then you always have the simple and easy option of committing suicide.

        If you don't, then either you don't mind Google keeping tabs on you, or you are a wuss.
      • It Is About Context (Score:3, Interesting)

        by EXTomar ( 78739 )
        It isn't that Google necessarily cares that it is "you" (actually they might but that is another thread...), but "you" are doing a search and then clicking on links in a particular order, which is a context that is important for ranking. At an abstract level, the relationship between what you searched and the links you tried is stuff Google wants to track to help enhance relevancy and search results. The problem is that with modern technology to do this they need to know some things that aren't anonymous whi
  • Uhm (Score:3, Interesting)

    by giorgiofr ( 887762 ) on Thursday March 15, 2007 @06:34AM (#18360069)
    All they have to do is erase the logs every day or just not keep them. It doesn't "take an effort". Anonymous proxies have been doing this for years.
    • Re:Uhm (Score:5, Insightful)

      by Rakishi ( 759894 ) on Thursday March 15, 2007 @06:50AM (#18360147)
      And anonymous proxies, unlike Google, do not need to make money or provide much of a service. Logs are very useful for such things.
    • Re:Uhm (Score:5, Insightful)

      by Whiney Mac Fanboy ( 963289 ) * <whineymacfanboy@gmail.com> on Thursday March 15, 2007 @06:51AM (#18360153) Homepage Journal
      All they have to do is erase the logs every day or just not keep them. It doesn't "take an effort". Anonymous proxies have been doing this for years.

      I know where you're coming from, but that would kinda fuck with their targeted advertising business model dontcha think?
      • Re: (Score:3, Insightful)

        by jacquesm ( 154384 )
        It doesn't have to. After all, the targeted ads are supposedly targeted to the *content* of the pages and your search query. No need to keep that for two years in order to target it better unless you have other plans for my data (such as selling my 'profile').

        • Re:Uhm (Score:4, Insightful)

          by daeg ( 828071 ) on Thursday March 15, 2007 @07:43AM (#18360481)
          I'm between the two extremes of agreeing with you and agreeing that data needs to be retained. As any of us who have taken a statistics class (or four) can tell you, you don't need access to the whole sample to provide accurate data. So, say, for instance, the Google engineers were working on a specific niche of the web, say, dog lovers. If I were designing something to better suit dog lovers, my first step would be pulling a report on the common search patterns of people that search for dog-related topics.

          Historical data that identifies a unique user is extremely useful. I do the same thing with our Intranet search and report tools. If I want to improve something, oftentimes the logs will give a very telling tale. (This accounting department employee searched for "expense", then "expense excel", then "expense spreadsheet", then "expense log", finally getting his document. I can then add the keywords 'excel' 'spreadsheet' to the actual document entry.) That said, you don't actually need to know who the unique user is, for all intents and research purposes, User5486734067 is just as useful as an IP+Cookie.
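A minimal sketch of the "User5486734067 is just as useful as an IP+Cookie" idea, assuming a keyed hash is used to derive the ID. Nothing in the thread says this is how Google or anyone else does it; `SECRET_KEY` and `pseudonym` are hypothetical names made up for illustration.

```python
import hashlib
import hmac

# Hypothetical secret held only by the log owner; an assumption for this sketch.
SECRET_KEY = b"example-key-rotate-me"

def pseudonym(ip: str, cookie: str, key: bytes = SECRET_KEY) -> str:
    """Map an (IP, cookie) pair to a stable opaque ID using HMAC-SHA256."""
    message = f"{ip}|{cookie}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    # A short prefix is enough to chain one user's searches together.
    return "User" + digest[:12]

# The same visitor always gets the same ID, so searches can still be chained.
first = pseudonym("203.0.113.77", "cookie-abc")
second = pseudonym("203.0.113.77", "cookie-abc")
other = pseudonym("203.0.113.78", "cookie-xyz")
print(first == second, first == other)
```

The ID no longer contains the IP or cookie, but the searches chained under it can still give a person away, which is the point made elsewhere in this thread about the AOL data.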
          • Even for the example you give I would not need to know *who* made those searches.

            There are two good reasons to keep the data, as far as I can see, the first is to avoid sending
            the same ad to someone twice (but for that you only need a history of what ads they've seen, not
            what they have searched for, though of course that does help to tag a user as a 'programmer' or
            an 'accountant'), the second is when you go in to the massive selling of profiles business.

            There are some companies that do this (Schober comes t
            • by daeg ( 828071 )
              I never said they need the "who", just the unique ID to chain the searches together.

              From my experience with AdSense, Google doesn't give direct access to any of the information. In fact, it makes sense for them to strongly protect their profiles. If they sell them, they lose control over them. Sure, they can retain legal control, but once they're out, they're out. Google isn't dumb, they'd rather make $1 for every profile access versus $100 up front, as the $1s will add up over time (not actual dollars, jus
              • true enough, apologies.

                As for the search profile study that was AOL's blunder, and after examining
                the data that AOL provided in some detail (several weeks worth of work) I am
                absolutely amazed at how privacy invasive this stuff is.

                That is why I'm eagerly awaiting a competitor to the big G that has a really
                strong privacy statement.

                If the quality is anywhere near comparable I'll switch in a heartbeat. But I
                do not doubt that I'd be one of very few people to do so. Not because I have
                something to hide, just becau
                • by Rakishi ( 759894 )
                  That is why I'm eagerly awaiting a competitor to the big G that has a really
                  strong privacy statement.


                  Most likely none ever will: not enough people care, and G will simply kick their ass due to having better data to model things with. Search isn't exactly an easy field to break into right now. There is a possibility for more or less niche search engines, but not at a google/yahoo/msn level. Sure, someone could make a brilliant new algorithm, but then it's very unlikely that they'd also have a strong privacy polic
        • by Rakishi ( 759894 )
          Like the other poster said, depending on what a data analyst is doing, such information could be useful. IPs, I may add, could be useful on their own to provide geo-location and ISP information, for example.

          For example, it is likely that google alters its search page with different setups to test various things in which case your long term reaction to such different ad methods could be useful. Likewise seasonal trends require long term data to find. There is a big difference between using data in production and
  • Mine already is (Score:3, Informative)

    by solevita ( 967690 ) on Thursday March 15, 2007 @06:36AM (#18360085)
    Although I did have to install the AnonymizeGoogle Firefox plugin to get it.
    • Re:Mine already is (Score:5, Informative)

      by solevita ( 967690 ) on Thursday March 15, 2007 @07:24AM (#18360345)
      Ignore that post above - I'm a moron. I meant to say CustomizeGoogle Firefox plugin. Get it here [customizegoogle.com].

      I guess that's what happens when you Slashdot before caffeine. I'm sorry.
      • I meant to say CustomizeGoogle Firefox plugin

        That helps.

        Of course, if you want to shorten log retention further than Google's "only 2 years!", you can go through a proxy like Anonymizer [anonymizer.com] or Tor [eff.org]. If the fullbore proxies are too much of a hassle, there's always the search proxies like Scroogle Scraper [scroogle.org] (where the log retention is 48 hours).

        Another approach is to poison the data mine with TrackMeNot [nyu.edu] by generating thousands of random searches in the background.
  • anonymizing it straight away! That would be an even quicker solution to the problem.

  • Why not anonymise the data after zero months? Are they required by law not to?
    • by Barny ( 103770 )
      In some countries, yes, they are required to.
    • Re:0 months? (Score:5, Insightful)

      by cdrudge ( 68377 ) on Thursday March 15, 2007 @06:49AM (#18360137) Homepage
      My guess is that they don't do it immediately because there is internal business value in mining the data: user patterns, length of stay, etc. After 18 or 24 months, the internal value has dropped significantly as things change quickly. I would have thought that the value would have dropped even quicker than that, say after 6 months or maybe a year.
      • This isn't quite true yet. Most people who use the internet are not very savvy when it comes to protecting their privacy. With everyone having gmail accounts, they can effectively trace a person's search habits over years. Especially if people use the same computer, or log in to gmail before every session. So no, the data doesn't become less useful for most users. On the contrary, it becomes more, and utility is only going to increase as google releases more and more services.

        An example off the top of my he
    • by Rakishi ( 759894 )
      Even if they weren't legally required it makes more business sense to keep as much data as possible as you never know when someone will need it for some project.
    • by xxxJonBoyxxx ( 565205 ) on Thursday March 15, 2007 @07:30AM (#18360381)

      Why not anonymise the data after zero months?
      Because Google's primarily a media company, like NBC, only with much finer detail about what you want to see. Like any media company, Google finds demographic data incredibly valuable because it allows them to "connect" you with the "correct" advertisers. There's no way in hell Google would let people be completely anonymous; it goes against their business plan. (I'd also bet three years from now we'll find through some court case that backup tapes somewhere really extend "anonymous after 18 months" to 4-5 years.)
  • by Anonymous Coward
    Google should not be collecting any of that huge pile of information AT ALL, not just anonymising it after 18 months. As the AOL case showed, search queries can be used to identify individuals even after AOL anonymized them, so it's not IP addresses they are recording, it's PEOPLE.

    There is no need to collect the IP addresses of searchers that haven't opted in to Google's personalized search. There is no law, that requires it.

    There is no need to store the IP addresses of individual visitors to websites when
    • Re: (Score:3, Insightful)

      by GweeDo ( 127172 )
      There is no need? What about the monetary need? Google doesn't really care who you are, but they do care about what you are looking for. The more they know about what you are looking for the better their AdSense program can do. The better it does, the more money they make.

      As for your whole "we have privacy" bit, sure you do. In your own home while using your stuff. The moment you sent your request out over the internet in plain text to a third party (that is a corporation out to make money you kno
      • No Consent (Score:4, Interesting)

        by Anonymous Coward on Thursday March 15, 2007 @08:02AM (#18360663)
        Exactly, it's to Google's MONETARY benefit that they record this information. The EU Privacy law says THEY CANNOT RECORD MORE PERSONAL INFORMATION THAN IS NEEDED FOR A TRANSACTION. Now that it's clear that search data is personally identifiable, the EU Privacy law should be used to FORCE GOOGLE TO QUIT IT.

        "The moment you sent your request out over the internet in plain text to a third party (that is a corporation out to make money you know) you lost that."

        Not so, the law says we have to consent and we didn't consent!

        And what about when that party isn't Google? Google Analytics is not on Google's site, it's embedded on third-party sites, and Google's AdSense is on other people's sites too. I didn't consent to handing my data to Google when I surfed to third parties' sites; Google took that data and recorded it in violation of EU privacy laws.

        DoubleClick has been sued over exactly this issue before, and backed down as a result.

        http://archives.cnn.com/2000/TECH/computing/01/28/double.click.lawsuit.idg/ [cnn.com]

        "A California woman has filed suit against DoubleClick, accusing the U.S.-based online advertising company of unlawfully obtaining and selling consumers' personal information, according to a statement issued by her attorney's office."

        "Hariett M. Judnick filed the suit in Marin County Superior Court in California, on behalf of the "general public of the state of California," the statement said.
        The suit alleges that DoubleClick employs Internet cookies to identify users and track their movements on the Internet. The company tracks and records the sites an individual visits, as well as the information transmitted on the sites, such as names, ages, addresses, shopping patterns and financial information."

      • by rtb61 ( 674572 )
        Technically speaking you as the search end user can make better use of personalised search history and refinement of results. Everybody tends to use search phrases and search styles in a different manner, especially in relation to the experience level of the user.

        Searching will only get more and more complex as time progresses and things like automatic language translations finally start to appear. Privacy on one hand or the search engine adapting to your search style, not really as clear cut a choice as

    • Google should not be collecting any of that huge pile of information AT ALL, not just anonymising it after 18 months. As the AOL case showed, search queries can be used to identify individuals even after AOL anonymized them, so it's not IP addresses they are recording, it's PEOPLE.

      AOL did not anonymize correctly. True anonymization would not have queries linked by "userid". Giving you 100 queries and saying "these 10 were made by one user, these 7 by another, etc." is far different from just giving you 10
    • Google could not exist without collecting this information. This data is central to its business model, and key to its differentiation from other search engines. Its history of growth (of individuals choosing to use Google over similar products) validates this approach and also demonstrates that the methodology is generally accepted. The great majority of web users see nothing wrong with the method even though concerns about it are getting a fair amount of publicity.

  • According to TFA (Score:5, Insightful)

    by ReallyEvilCanine ( 991886 ) on Thursday March 15, 2007 @06:50AM (#18360149) Homepage
    Google plan to make it "more anonymous". Like pregnancy, data either ARE anonymous or they ain't. You can't qualify an absolute, and "anonymous" is an absolute condition indicating lack of information.
    • Comment removed based on user account deletion
    • by Cytlid ( 95255 )
      So you're saying "Data are either impregnated with anonymity or they ain't?"

      I need another cup of coffee.
    • by catbutt ( 469582 )
      Anonymous is not an absolute. That is ridiculous. Like almost everything in the world, there are many shades of gray.

      In this case, it can be determined that a search was within a group of 256 people, but they can't tell which one. What if they just stored the country of the user? Same thing, just larger group. More anonymous.

      There are all kinds of degrees of anonymity. I'm not advocating any side to the issue, but if you are going to look at the issue intelligently, seeing it in simplistic black
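To make the "group of 256" idea above concrete, here is a rough sketch of IP truncation using Python's standard `ipaddress` module. It assumes the anonymization simply zeroes the low bits of an IPv4 address; the thread does not say which bits Google changes or how, and `mask_ipv4` and `keep_bits` are hypothetical names.

```python
import ipaddress

def mask_ipv4(addr: str, keep_bits: int = 24) -> str:
    """Zero the low (32 - keep_bits) bits of an IPv4 address.

    With keep_bits=24 the last octet is dropped, so 256 possible
    addresses collapse into a single logged value.
    """
    # strict=False lets ip_network accept an address with host bits set
    # and masks them off for us.
    network = ipaddress.ip_network(f"{addr}/{keep_bits}", strict=False)
    return str(network.network_address)

print(mask_ipv4("203.0.113.77"))      # 203.0.113.0
print(mask_ipv4("203.0.113.200"))     # 203.0.113.0 (same group of 256)
print(mask_ipv4("203.0.113.77", 16))  # 203.0.0.0 (larger group, "more anonymous")
```

Keeping fewer bits makes the group larger, which is the sliding scale of anonymity being argued about here.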
  • Stop googling for "jihad death to american president" if you're worried about getting caught.

    I should point out that your google query goes over plaintext HTTP so anyone in between can eavesdrop on your queries.

    Tom
    • by solevita ( 967690 ) on Thursday March 15, 2007 @07:08AM (#18360243)

      Stop googling for "jihad death to american president" if you're worried about getting caught.
      You're correct. The only people that demand privacy are those up to no good. How about I come over to your house later, sit in your bed for a bit, go through your drawers and your phone records, take some pictures of you and your friends, ask the neighbours some pressing questions?

      If you've got nothing to hide, you should have no problem with this.
      • Ah, the out of context argument. My house is private by the definition that I have locks on the doors and blinds on the windows. Your analogy may make sense if, say, a public walkway passed through my living room.

        I'm not saying people shouldn't have privacy, I'm saying if you export your secrets outside of your domain, you shouldn't expect privacy.

        You don't do your personal finances on a city bus do you?
        • by Dunbal ( 464142 ) on Thursday March 15, 2007 @07:49AM (#18360519)
          Ah, the out of context argument. My house is private by the definition that I have locks on the doors and blinds on the windows.

          Funny - my computer is in my house, behind locks and blinds too. Hey, Google's computers are also behind lock and key, and they even have security guards and alarm systems. I don't ever remember giving Google permission to disclose any information shared between them and me - oh and heaven forbid I go around giving away the information Google found for me - I'd get sued!

          Why would the whole world automatically be party to the information Google and I shared one evening? My computer sent that information to a specific internet address, and the answer came back specifically to my computer.

          Not so out of context...
          • by tomstdenis ( 446163 ) <<moc.liamg> <ta> <sinedtsmot>> on Thursday March 15, 2007 @08:08AM (#18360705) Homepage
            This is why it pays to have a modicum of computer knowledge.

            Assuming you're not trolling...

            When you send a query to google, it goes over the "internet" in the clear. That is, not encrypted. Anyone who can see it can read it. Well who can read it? Turns out a lot of people. Between me and google are probably 10 different boxes. 5 of which are just my ISP's routers. The other five are boxes on other networks, not even related to Google.

            There is no inherent requirement for privacy like there is with telephones (maybe there ought to be one). But that said, you're giving your data to Google, willingly no less. That gives them every right to record it. You gave them permission by using their service, I guess you never read their TOS [google.ca] which is your fault, not theirs. Think about the analogy in the real world. This is like you handing your driver's license to every stranger you meet, then getting upset when some of them write it down.

            If you don't want your assets [IP, location, name, platform, etc] leaked to Google you should use an anonymous proxy.

            Tom
            • by kjart ( 941720 )

              Stop googling for "jihad death to american president" if you're worried about getting caught.

              When you use language like "caught" you are obviously not referring to Google, but rather to some external agency (i.e. the government). You are changing the parties involved to strengthen your argument.

          • Funny - my computer is in my house, behind locks and blinds too.

            But your search queries leave the house, unencrypted, with no guarantee of protection and travel to Google. That's where the analogy has fault.
            • by Crizp ( 216129 )

              But your search queries leave the house, unencrypted, with no guarantee of protection and travel to Google. That's where the analogy has fault.

              It's like sending letters without envelopes and demanding that the USPS makes sure no-one can snap up and read the letters while in transit. Or going to the post office in the nude and demand the post office makes sure nobody can see your penis before you get there.

              What? You don't have a penis, and I'm an insensitive clod? Sorry.

        • I'm not saying people shouldn't have privacy, I'm saying if you export your secrets outside of your domain, you shouldn't expect privacy.

          Although really, there is a good argument to be made that people have the expectation of privacy when they use the internet from their own homes, even if it is not technically feasible.

          To use the house analogy, I assume you don't keep your blinds down on your windows 24/7. Wouldn't it feel wrong if someone were using a telescopic lens from 200 feet away and watching your

          • I should point out it's legal to be naked in your home, but not in front of a window where others can see.

            There is a certain question about whether you can use information eavesdropped off the internet in legal proceedings. But that's a question of law, not privacy. If you're worried about privacy, you must keep your secrets to yourself.

            And frankly, you don't have a contract with Google to not log your searches. Add to that that you're doing it over http and it's hard to argue anything else.

            i could see if you
            • But that's a question of law, not privacy.

              Which is why there should be a law to protect privacy on the internet. Law and privacy are not mutually exclusive.

              This isn't simply a matter of reading TOS's. I don't see why we should have to wait for a corporation to offer it to us before arguing we deserve privacy. Again, there is an expectation of privacy for telephone and U.S. mail communication, so why should we throw up our hands and abandon all hope of privacy for the internet?

              • arrg..

                Ok let me explain this to you.

                Even over the phone, you have no privacy, even though it's illegal to wiretap without a warrant. There is a difference between privacy and "inadmissible in a court of law."

                Imagine you were a spy, and you wanted to communicate with your handler. Would you talk plainly and openly over the phone because wiretapping without a warrant is illegal? No. You'd encrypt the message [codewords, etc.]

                And while yes, I think the government should require warrants before wiretapping
                • by o'reor ( 581921 )

                  If google, a party to the communication, decides to divulge the nature of the data, that's their business.

                  Right. So, some day, you go to see your doctor, and he finds you terribly ill. You know the disease will evolve into a really crippling illness, and your health insurance is just about to be renewed. Question: do you mind if your doctor, "as a party to the communication" you just had with him, "decides to divulge the nature" of the disease to your insurer? Is that "their business" and theirs only?

                  • You go to Google for medical treatment?

                    Your argument makes no sense, for what you are talking about is doctor-patient confidentiality. As far as I know, there is no such thing as Google-searchee confidentiality.

                    Look, it's this simple. If you transmit your queries, host strings and other info, over plaintext, to a private server, with whom you have no contract, don't assume that the information you transmit is not being seen by other people's eyes.

                    Tom
      • Re: (Score:3, Funny)

        by Dunbal ( 464142 )
        If you've got nothing to hide, you should have no problem with this.

        Yeah while we're there we can install the webcam in his bathroom and broadcast on the net every time he takes a crap. I have a pair of guys willing to do the commentary on wiping techniques to add to the video...
      • ... How about I come over to your house later, sit in your bed for a bit...

        If you've got nothing to hide, you should have no problem with this.

        It all depends. Are you hot? ;-)
    • Re: (Score:3, Insightful)

      by garcia ( 6573 )
      Stop googling for "jihad death to american president" if you're worried about getting caught.

      Excuse me?! I live in America and if I want to research the results of the search terms "jihad death to american president" I'm well within my fucking rights.

      Fuck you for saying otherwise.
      • Re: (Score:3, Interesting)

        by tomstdenis ( 446163 )
        Well you're describing a law enforcement problem not a privacy issue.

        Google is within their rights to gather as much information as you feed them (your ip, time of day, host strings, query string, etc).

        My point was if you were planning on committing crimes, you shouldn't use google to find tips.

        Tom
        • Google is within their rights to gather as much information as you feed them (your ip, time of day, host strings, query string, etc).

          I see the problem now; you clearly don't understand the extent of Google's monitoring. They're not logging just IP addresses, they're logging people. The AOL data that came out showed how you could follow tracking cookies to see exactly what people, not IP addresses, were searching for.

          I don't see why you have such a problem with it anyway. Many people around the world asked fo

          • Re: (Score:3, Insightful)

            by tomstdenis ( 446163 )
            I'm not against google cleaning their logs. I'm against people claiming this is a privacy issue.

            Google logging all your queries: Not a privacy problem.

            Bank leaking your SSN via stolen laptop: Privacy problem.

            AOL knowing that you like midget porn: Not a privacy problem.

            Government using sub-standard contractor to manage passport data, later turns up on broken into computer: Privacy problem.

            By screaming wolf every time "data" is mentioned you desensitize people to real privacy problems.
  • I bet that means the IAO has their project running properly now so they no longer need to use Google Logs ...
  • by Anonymous Coward

    After previously fighting turning over search data to the feds, it looks like they are striking another blow to the "think of the children" crowd.

    Anybody who remembers this incident probably also remembers the article 'Google in bed with the CIA' too:

    "Google was a little hypocritical when they were refusing to honor a Department of Justice request for information because they were heavily in bed with the Central Intelligence Agency, the office of research and development," said Steele.

    http://www.prisonplanet.com/articles/october2006/271006googlecia.htm [prisonplanet.com]

    Makes me wonder how fast does the CIA anonymize their material? Ha!

    • Yep, I love how we hear all this great theatre about Google "not being evil", "fighting subpoenas" and "anonymizing search records" while at the same time they become more firmly embedded in the US spy services. What else would one expect from a business that is (according to another poster) "primarily a media company, like NBC"?

      Here's a quote from William Colby, former Director of the CIA:
      "The Central Intelligence Agency owns everyone of any major significance in the major media."

      Plus ça change...
      • That's the way it's supposed to be. If the CIA didn't own people then they would have to be shut down for negligence.
    • by Goaway ( 82658 )
      And what more reliable source could one imagine than a 9/11 conspiracy theorist?
  • "you're gone" [you are]
  • since that data could be abused in any number of ways, including credit scoring, insurance scoring or leaks of "interesting details" to the press. Probably those would hurt Google's reputation more than any additional income it could generate, but it's still the better policy.
  • If you're worried about privacy, I recommend Firefox [getfirefox.com] and the Customize Google extension [mozilla.org]. I'm also a fan of Googlepedia [mozilla.org].
  • 18-24 months? (Score:2, Insightful)

    Which is it? 18, 19, 20, 21, 22, 23 or 24?
    • It might not be a fixed number at all, in which case their estimate is as exact as they can be without going into explicit detail.

      I'm guessing they will have a new process, executed every 6 months, which anonymizes all logs older than 18 months. How long would any given search remain non-anonymous under that approach? 18-24 months.
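      To make that guess concrete, here is a rough sketch. It assumes a purely hypothetical batch job that runs every 6 months and anonymizes anything at least 18 months old; none of this is Google's published process, and the function name is made up for illustration.

```python
import math

def age_when_anonymized(created_month: int, threshold: int = 18, period: int = 6) -> int:
    """Age in months of a log entry at the first batch run that catches it.

    Assumes (hypothetically) that the job runs at months 0, period, 2*period, ...
    and anonymizes every entry that is at least `threshold` months old.
    """
    earliest = created_month + threshold             # first month the entry qualifies
    run = math.ceil(earliest / period) * period      # next scheduled run at or after that
    return run - created_month

# Entries created in each of the first 12 months of logging
ages = [age_when_anonymized(m) for m in range(12)]
print(min(ages), max(ages))  # 18 23
```

      With whole-month granularity the ages come out between 18 and 23; an entry logged just after a run would sit for almost 24 months, which is where an "18-24 months" figure could come from.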
  • Comment removed based on user account deletion
    • by Alascom ( 95042 )
      >That's still far too long and is most likely motivated more by logistical concerns in
      >retaining so much data than out of any act of benevolence. However it definitely makes
      >good PR to paint this as 'Taking steps to improve privacy'...

      I am sure that while statistical analysis is one possible use, another use is fraud prevention. Google makes money off each search query. However, there are people who try to scam the system using the AdSense and AdWords programs and keeping a year or two worth of data
  • by guanxi ( 216397 ) on Thursday March 15, 2007 @07:59AM (#18360605)
    To quote them:
    "It is difficult to guarantee complete anonymization, but we believe these changes will make it very unlikely users could be identified."

    "Changing the bits of an IP address makes it less likely that the IP address can be associated with a specific computer or user. Cookie anonymization makes it less likely that a cookie can be used to identify a user."

    "[I]t's possible that data retention laws will obligate us to retain logs for longer periods."

    "How many subpoenas for server log data does Google receive each year?
    As a matter of policy, we don't provide specifics on law enforcement requests to Google."


    I don't think it will mean much unless they publish their anonymization technique. Even Google seems to have doubts about it, and considering the resources of some attackers (e.g., national governments), if the anonymization can be broken it will be.

    But Google's anonymization does not have to be perfect: Google isn't the only place your google.com activity is recorded: There's your personal computer, possibly your ISP, other sites (referrer links show Google search terms), etc. As long as Google makes their anonymity difficult enough to break that it's significantly easier to go elsewhere for the information, they've done their job. If you need to be anonymous, I hope you are taking other steps.

    I, for one, welcome the merciful intentions of our benign new overlords.
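    For what "changing the bits of an IP address" might look like in practice, here is a minimal sketch. Google has not said which bits it changes or how, so zeroing the low bits is purely an assumption, and `mask_ip` and `keep_bits` are names invented for the example.

```python
import ipaddress

def mask_ip(ip: str, keep_bits: int = 24) -> str:
    """Zero the low (32 - keep_bits) bits of an IPv4 address.

    Which bits to drop is an assumption; the quoted policy does not say.
    """
    # strict=False lets ip_network accept an address with host bits set
    # and mask them off for us.
    net = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(net.network_address)

print(mask_ip("203.0.113.77"))                # 203.0.113.0
print(mask_ip("203.0.113.77", keep_bits=16))  # 203.0.0.0
```

    Keeping 24 bits leaves each logged address indistinguishable from 255 others in the same block; keeping fewer bits widens the crowd but makes the log less useful for things like geographic targeting.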

  • Didn't AOL get into a lot of trouble for this?

    Personally... we knew this was going to happen. Anyone that's surprised is a fool.
  • by WED Fan ( 911325 ) <akahige@NOSPaM.trashmail.net> on Thursday March 15, 2007 @08:32AM (#18360973) Homepage Journal

    List of nifty little phrases that have bitten their speakers in the ass:

    • They will never bomb Berlin
    • Read my lips, no new taxes
    • I did not have sex with that woman
    • Mission accomplished
    • Don't be evil

    Now Google brings us:

    Let's just be less evil, now that we've been caught.

  • The 'think of the children crowd' should be very pleased by this - children search for sketchy things all the time... and then their parents get blamed for it.

    'Twould be better if it all stayed anonymous, in my opinion
  • Personally I think it's all a load of BS. If they really cared about our privacy, and if all they really needed my IP addy for is to aggregate my searches to 'better serve me', then all they have to do is one-way hash my IP addy. Then they can still tie all my searches together, and my gmail and such, but they wouldn't be able to back track it. And the govn't could demand all they want... you want the IP of the user who searched this? Here it is Mr. Bush... go nuts: x867:%dsgfk435j>67&*g[fg

    So forg
    • by santiago ( 42242 ) on Thursday March 15, 2007 @10:39AM (#18362785)
      There are 2^32 IP addresses under IPv4. If Google is doing the hashing, then they know the hash function. How long do you think it would take them to brute-force break the hash by hashing every possible IP address and creating a map from the hashed values back to the originals? Express your answer in microseconds.

      (If your solution is to increase the space of inputs by adding a variable salt value, please explain how this allows them to use the resulting hashes for aggregation.)
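      A small sketch of both points, under stated assumptions: SHA-256 stands in for whatever hash would actually be used, the function names are made up, and the search is limited to one /16 (65,536 addresses) so it runs quickly. Covering all 2^32 addresses is the same loop with a bigger range.

```python
import hashlib
import ipaddress
import os

def hash_ip(ip: str) -> str:
    """Unsalted one-way hash of a dotted-quad IPv4 address (SHA-256 is an assumption)."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()

def build_reverse_map(network: str) -> dict:
    """Hash every address in `network` and map each digest back to its address."""
    return {hash_ip(str(ip)): str(ip) for ip in ipaddress.ip_network(network)}

def salted_hash(ip: str, salt: bytes) -> str:
    """Same hash, but with a salt prepended to the input."""
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()

# Brute force: whoever knows the hash function can invert any logged digest.
reverse = build_reverse_map("192.168.0.0/16")
logged = hash_ip("192.168.42.7")
print(reverse[logged])  # 192.168.42.7

# A fresh random salt per record blocks the lookup table, but the same IP
# no longer produces the same digest, so searches cannot be tied together.
a = salted_hash("192.168.42.7", os.urandom(16))
b = salted_hash("192.168.42.7", os.urandom(16))
print(a == b)  # False (two random 16-byte salts will not collide in practice)
```

      A single fixed salt would restore aggregation, but whoever holds that salt can rebuild the lookup table, which puts you back where you started.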
  • 127.0.0.1 (Score:4, Funny)

    by supun ( 613105 ) on Thursday March 15, 2007 @09:49AM (#18361871)
    Just hard code the function that grabs "REMOTE_ADDR" to return "127.0.0.1". That way the feds will think all the kiddie p0rn searches came from the computer they are using.
  • Why is Google getting any favorable press at all for this? They never should have been doing it in the first place.
  • There is absolutely no reason for them to retain logs linking searches to IP addresses for even 18 seconds, let alone 18 months -- this isn't "improving Google" for any of their users, no matter how much they claim it is.

    Keeping search history for logged-in users is one thing; I can see how some users could find that useful, just like browser history autocomplete. Perhaps they want to keep logs of non-logged-in users around for something like geographical targeting, but there's no reason they can't process

  • Why don't they just not save search data in the first place?
  • Google is gathering a huge trove of information about us and this shows it is not anonymous. Search is only part of what they have. The more Google services you use the more you let them build a very detailed profile of you. And the more you do that the less privacy you have.

    They know what you search for, who you IM and email and about what, where you have appointments and what you bought. You essentially have no privacy.

    If you value your privacy do not use any single provider and spread your searches,

"He don't know me vewy well, DO he?" -- Bugs Bunny
