EU Recommends Slashing Search Data Retention
Wayland writes "The European Union's Article 29 Working Group has completed its PDF report on data protection and search engines. The group recommends that search engines only be allowed to hold onto search data for six months. 'To hang onto data for longer, search engine operators will need to show that such data is "strictly necessary" to offer the service. Google and others have long said that they need to retain data in order to refine search results, prevent click fraud, and launch new services like spell check (which, in Google's case, was built from user search data). In addition, the data that is kept will need to be guarded more closely. The working group concluded that IP addresses could be used to identify individuals; if not by the search engine itself, then by law enforcement or after a subpoena.'"
Tracking and identifying a piece of data. (Score:4, Insightful)
In six months you can intermingle the data items so much there's no way of proving you're actually storing the data and you'd still have what you need of that data.
How does law track the identity line of a data item? Data has no memory and leaves no trace.
DataProtection Act (Score:5, Informative)
Briefly, so long as data is personally identifiable you must show that you are not retaining it longer than necessary. If I summarise or analyse data and remove information which makes it personally identifiable - names, addresses, telephone numbers, email accounts - then it is not covered.
IMHO the US stands in need of a Data Protection Act, as an amendment to the Constitution. The present Administration seems to be looking for ways of keeping track of its citizens which avoid the Constitution. Technically in Europe it is probably illegal to send personal data via GMail - because it is exporting it to a country that does not meet European standards for personal data protection.
Re: (Score:1)
Re: (Score:1)
seems to be looking for ways of keeping track of its citizens which avoid the Constitution. Technically in Europe it is probably illegal to send personal data via GMail - because it is exporting it to a country that does not meet European standards for personal data protection.
Shouldn't it be illegal to send someone's (such as a customer's) personal data on gmail since:
Re:Tracking and identifying a piece of data. (Score:4, Informative)
'Personal' information is any information that can be linked to a person. This can be an (IP-)address, phone number, birth date and other data that is generally seen as being personal, but also information like the URLs visited by a person, or the e-mails sent to a person. The 6 months start counting as soon as a system no longer absolutely needs the data for its day-to-day operation.
As an example, HTTP logs showing which IP address visited what URL can be retained for a maximum of 6 months. If you send out snail mail to a bunch of subscribers, then you are obliged to delete a subscriber's address at most 6 months after he unsubscribes (or after he dies). If you still need the personal data (e.g. you need people's addresses to be able to send them invoices as long as they still have a contract with your company) then you are of course allowed to store that data. It also means that any statistics you need to make on customer-related data will have to be made before that data is deleted, and the statistics cannot contain any information which would allow them to be tied to a person.
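To make the "statistics first, then delete" idea concrete, here is a minimal sketch. It assumes a log held as (timestamp, ip, url) tuples and approximates "6 months" as 183 days; the function name and the data layout are invented for illustration and are not taken from the law or the report.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumption: "6 months" approximated as 183 days.
RETENTION = timedelta(days=183)

def purge_and_summarise(entries, now):
    """Split log entries into those still inside the retention window
    and anonymous per-URL hit counts for the expired ones.

    entries: iterable of (timestamp, ip, url) tuples (hypothetical layout).
    """
    kept = []
    url_counts = Counter()
    for ts, ip, url in entries:
        if now - ts <= RETENTION:
            kept.append((ts, ip, url))   # still needed, keep as-is
        else:
            url_counts[url] += 1         # keep the statistic, drop the IP
    return kept, url_counts
```

The expired entries survive only as counts per URL, so nothing in the summary can be tied back to an address.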
Another part of the data protection law mandates that a person has to be informed of every storage of his personal data, and has the right to look into that data and update it if there are errors in it.
All in all, the law ensures that Europeans can be pretty certain that their (online) privacy isn't invaded (as long as they surf only European websites).
Re: (Score:1)
Re: (Score:2)
How does this affect Google Groups, which archives the last ~20 years of Usenet messages? One of the oldest messages I ever posted is still stored there, from 1988, and includes my real name, email account, and Usenet-style IP address. Will my personal data be erased to comply with E.U. law, but the text still preserved? Or would the whole thing be erased?
It would be a shame to see that piece of personal history disappear because of some poorly-worded law.
Re: (Score:3, Informative)
How does this affect Google Groups, which archives the last ~20 years of Usenet messages?
Doubtful it would have an effect, as:
(1) It would be making a law retro-active (with respect to historical documents)
(2) It is implicit in usenet that this information is being published and is made public (Ignorance is no excuse, one could say). Usenet is a public forum.
IANAL of course, so who knows, but common sense and common knowledge of the way laws are enforced in the West leads me to believe that usenet should not be affected by data retention laws. I will emphasize; publishing to usenet means publi
Re: (Score:3, Informative)
I will be tactful in saying that there is some logic to your question (a bit naive from my perspective, I must admit), but I think such questions should be asked. Sometimes the best of us miss the obvious.
Best regards,
UTW
Re: (Score:2, Interesting)
This isn't SE-exclusive (Score:3, Insightful)
Re:This isn't SE-exclusive (Score:5, Insightful)
Re: (Score:1)
That's a good point, but I would think emails and chat logs are beyond the jurisdiction of this EU article; it seems to be specific to search engines and search queries.
It's true, naturally, that Google has access to a lot more data than a single random website has; however, no matter how small a site is, it could quite easily expand the information gathered on the people who visit. For instance, in some cases, by googling an IP, you can see through publicly available stats which other sites that one IP has visite
Re: (Score:2)
Re: (Score:1, Interesting)
Re: (Score:2, Informative)
Re: (Score:1)
"strictly necessary" (Score:4, Insightful)
If that is the law to follow, they will make it "strictly necessary" by adding features using that data, I guess. Just making it a bit harder is a lot of lawmaking for little effect.
Re: (Score:1, Interesting)
It's one thing to require anonymization of data (which I find reasonable). However it's quite another to say you've got to delete the da
Re: (Score:2)
RTFA, lemming (Score:5, Insightful)
RTFA, lemming. The summary _again_ is inflammatory crap, yes, what else is new? But that's not what TFA says.
They're _not_ required to delete data completely, they're required to delete data that can identify you personally. Like IP, grouping between those searches, etc.
They do _not_ need that to refine their searches. If I search for, say, "Oracle auto-tuning", that's that. I expect the same result regardless of what my IP is, regardless of whether I searched for "WebSphere XA configuration" before, or "Fluffy tail buttplugs" or whatever. You can tune the search with just the search string. You don't need to track me for that.
_That_ is the friction between the EU and Google: that Google wants to keep that kind of identifiable information like the pair of IP and timestamp. Google has been playing bullshit handwaving games along the lines of "but we really need the IPs", then "but some people change IPs, so it won't identify them for ever", then "wait, would it be ok if we changed a bit or two of the IP?" along with a good helping of "but we'll keep it for 18 month before changing those bits anyway!"
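For reference, "changing a bit or two" is easy to sketch. This is only an illustration, not Google's actual scheme: `mask_ip`, `candidates` and `keep_bits` are made-up names, and the sketch handles IPv4 only.

```python
import ipaddress

def mask_ip(ip, keep_bits=24):
    """Zero the low (32 - keep_bits) bits of an IPv4 address (hypothetical helper)."""
    net = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(net.network_address)

def candidates(keep_bits):
    """Number of distinct IPv4 addresses that collapse onto one masked value."""
    return 2 ** (32 - keep_bits)
```

Dropping a single bit (`keep_bits=31`) leaves only two possible originals, and dropping the whole last octet (`keep_bits=24`) still leaves just 256, which is why masking a bit or two on its own does little for anonymity.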
And seeing Google protest at every step when they're told to stop tracking people, and, yes, exactly such bullshit fallacies as that they really need that IP to refine the search algorithm... is kinda funny. I guess "do no evil" was for when they were small and cuddly. Now that they're the 800 pound gorilla of the online advertising market, heh, turns out that they get as big a boner as any other PHBs out of trying to rape people's private data for a quick buck.
But, hey, I'm willing to be educated. _You_ tell me how deleting the IP information is gonna make search engines tank. Exactly which search algorithm relies on knowing my IP? No, seriously.
They can keep their statistic history for as long as they want to, but they can't keep your personal data. It's that simple, so let's stop handwaving strawman scenarios. They can (and should) keep information like "Shares of Moraelin Buttplugs Corp peaked at 1.50 Euro a share last year." But they have no reason to retain info like "Freddy Krueger lives on 22 Elm Street, and bought 2 shares of Dr Kevorkian's Suicide Clinic last year," just because he bought those 2 shares last year.
A financial advisor's or stock broker's job is to trade on the stock market. It's _not_ to collect your personal data and sell it to the highest bidder. It's not their job to data-mine your private information. It's that simple: stick to selling those shares.
Mind you, even for data mining, there's a fine line between information and trivia. Stuff like "which team won the most games last year" is information. You can make an informed prediction for this year based on it. Stuff like "which team won the most games on a Wednesday, in rain, under artificial light" is trivia.
Similarly, "people from Germany buy more economic games than those in the USA" is information. Stuff like "people living in odd-numbered houses, on streets whose name ends in an 'e', and born on a rainy Thursday, buy more economic games" is useless trivia.
"50% of the gamers are between 25 and 50 years old" is information. You can decide a target demographic based on that. "People born on a Tuesday the 14th have the most gamers, at a whole 0.01% of the total" is trivia. Even if you figured out how to make games especially fit for people born on a Tuesday the 14th, it's too thin a slice to individually bother with. Etc.
Going too deep into details slices your data too thin and produces meaningless trivia.
There simply is _no_ sane justification for the kinds of personal information that especially the USA PHB's try to collect. Other than spamming you personally
Re:RTFA, lemming (Score:4, Interesting)
While your post was very informative, the best I can tell is that it boils down to this: Google has no reason to keep individual IP data, because such data is useless for anything other than marketing and selling to other people.
So, with that in mind, and not taking a stance on whether it's still too personal even with a good reason, let's look at some data mining techniques.
Say for example, you have a region of the midwest united states, the exact middle of the bible belt. For those unfamiliar with the term, that means a place where the christianity is high, and preached loudly, often, and to anyone within earshot, and the ability to be nonchristian is relatively low. But say you have this group of people, and a lot of their searches are of religious material. You could use them as sort of a "expert" group, giving a little more weight to their likes and dislikes as a whole to adjust pagerank for their area of study, religion. This allows for pages that may be far down the list, but accurate and factual, to be pulled up a bit so the rest of the world might find them, and if they truly are good, then they'll stay up there afterwards.
If not, then the page will drop back down in rankings again and it will have earned the low rank it has.
You could not do this without some form of IP/region tracking, and the benefit increases with the accuracy with which you track IPs. If you track single people, you can get more meaningful data: for example, you can narrow your "expert" group to Pastor Brian, Sister Marian, and Sister Margarette, and leave out their neighbors Druid Matterson and Buddhist Huy Ngyen.
This decreases false positives from your expert group and also allows you to refine more precisely where each person might have good judgement.
That hopefully explains the IP section a bit.
As for timestamps, I only have two theories about them, and both are equally likely.
The first is that the timestamps are used, in combination with the search terms, to help them optimize their load balancing. Since I'm sure they cycle systems onto and off the grid as the systems rebuild databases or do maintenance, you could use such data to tell, for example, when you could most likely take the Yak-Yodeling server offline to redo its database and crawl pages, have people get search results from a slightly out-of-date backup, and minimize the impact.
The other option is that there are some results that are time sensitive. Even without linking IP data to geographic data, if you notice that an IP range searches for "restaurants" + dinner at a certain time of day, and you get a search for restaurants, you might give preference to dinner selections at that time of day, because you could assume they are looking to go eat.
Anyway, I hope that clears up a bit how such specific data is usable and important. Could it be usable in other forms that didn't identify the IP? Probably, but it would serve no practical purpose, because as long as they have some system for converting an IP to a unique identifier to identify a group of searches, there will always be a way to reverse or brute-force the originating IP, given the time and interest on the part of whoever wants it.
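The brute-force point can be illustrated with a short sketch. It assumes the identifier is an unsalted SHA-256 of the dotted-quad string, which is purely a hypothetical scheme chosen for the example; `pseudonym` and `reverse_pseudonym` are invented names.

```python
import hashlib
import ipaddress

def pseudonym(ip):
    """Hypothetical scheme: unsalted SHA-256 of the address string."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()

def reverse_pseudonym(token, cidr):
    """Try every address in the given range until one hashes to the token."""
    for addr in ipaddress.ip_network(cidr):
        if pseudonym(str(addr)) == token:
            return str(addr)
    return None  # not in this range
```

The demo walks a single /24, but the whole IPv4 space is only 2^32 (about 4.3 billion) addresses, so the same loop over everything is a matter of patience rather than cryptanalysis. A secret salt or key would change that, but only for people who don't hold the key.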
Duly noted, but... (Score:2)
1. If they can do that kind of optimization (which I highly doubt), then make it opt-in, not impossible to opt-out. If Sister Marion wants her bible searches optimized that thoroughly, or High Druid Matterson wants his searches free of the neighbours' puritan crap, then they can register and log in, you know.
There you go, it's an even easier way to track people who want to be tracked. More importantly, it's a responsible implementation. Just because you can do something, is no
Re: (Score:1)
Say for example, you have a region of the midwest united states, the exact middle of the bible belt. For those unfamiliar with the term, that means a place where the christianity is high, and preached loudly, often, and to anyone within earshot, and the ability to be nonchristian is relatively low. But say you have this group of people, and a lot of their searches are of religious material. You could use them as sort of a "expert" group, giving a little more weight to their likes and dislikes as a whole to adjust pagerank for their area of study, religion.
Who (at Google) is going to decide which regions of the world contain the most "experts" on each topic? That sounds like a lot of work. A far better approach is to optimize pagerank for a particular topic -- such as religion -- based on all searches for that topic. Chances are people using the same search terms are looking for the same thing, regardless of where in the world they're searching from. If they're searching for something different, they should be using different search terms.
Re: (Score:1)
"They do _not_ need that to refine their searches. If I search for, say, "Oracle auto-tuning", that's that. I expect the same result regardless of what my IP is, regardless of whether I searched for "
Google (and all search engines) are interested in giving you a more personalized set of results tailored to you specifically. If you search for "Oracle" right after you search for "delphi greece", you're probably looking for something different than if you had just searched for "RDBMS".
Re: (Score:2)
Thing is, at what cost? The way they want to customise your "search experience" is by collecting personal, trackable, invasive data about you. Consider another case. You look up, in order: online handgun ordering, bank hours, and then Google maps of banks near your house.
Are you planning a bank robbery? Google might think so--if the
Re: (Score:2)
Google actually uses IP data to localize information in searches intelligently, including suggesting search terms via typeahead. So their desire to store IP address is actually somewhat valid.
Re: (Score:3, Insightful)
Re: (Score:1)
Why do so many people feel the need to get the State's guns involved in people's voluntary transactions
Re: (Score:2)
But again, in most cases they have no business to be doing even that data mining in the first place. A financial market _can_ function without knowing the exact address and birthday of everyone who ever used their service. And Google _can_ refine their searches without trying to track who did them. Etc.
One of the services provided by Google is the ability to recommend pages or searches you may be interested in. For example, if you do a Google search for "Semiconductor Of Moscow", it will recommend a search for "Conductive Paronite" because those two topics are semi-related based on searches by other users. For this level of detail, Google keeps very detailed information concerning searches, in the same way that Stumbleupon tries to recommend other sites available on the internet.
This may be trivia to
Re: (Score:1)
Not the case with Google. It may
Re: (Score:2)
First, I'm not defending Google's right/desire to keep any of this data...
That said, you are wrong. Search results are not idempotent, they chang
Re: (Score:2)
Re: (Score:2)
One would think that this will be considered in a law. Would the new feature in the service be strictly necessary? Should it be separated from the basic service? I would not suggest that companies try to circumvent laws like this, as their intentions would not be looked upon kindly.
We're SOL (Score:2)
And the way it's looking, law makers are dragging their feet on this type of thing just so the government has this massive grey area to work in. But, then again, I'm just at the bottom looking up. Perhaps they see it differently from their angle.
Re: (Score:3, Funny)
Really I don't mind (Score:1)
Re: (Score:1)
But the vast majority of those people, unlike you, have no clue as to what that entails. They don't even know what an IP address is. I think you're very optimistic to rely on them to provide overwhelming demand or lobbying power to preserve what you want.
Re: (Score:2)
Re: (Score:1)
Thanks for the reminder, I heard about this a while back but had forgotten - I've just added 'rm -rf .macromedia' to my .xsession file, that should wipe out Flash cookies once per login on Linux.
Privacy-conscious search engines? (Score:3, Interesting)
Re: (Score:2, Funny)
Cryptonomicon, by Neal Stephenson.
Re: (Score:2)
Re: (Score:2)
Re:Privacy-conscious search engines? (Score:4, Informative)
For example, Facebook was immune from investigation into what they were doing with personal data. They established a London office (to sell adverts to EU people) and then they were investigated.
(Of course, Google could still keep the data of everyone else. It depends if it's easy for them to do this -- it probably is.)
Re: (Score:2)
That said, right now security-paranoid geeks (or rather, anonymity-paranoid ones) probably use tor, which is more than fast enough for searches, or scroogle ( http://www.scroogle.org/cgi-bin/scraper.htm [scroogle.org] ), which is good enough most of the time (it makes the query for you, so Google only sees their server).
As for finding an ideal country to set up servers, thepiratebay were thinking about buying an island and declaring it an independent country.
That
Re: (Score:1)
I know of one that *claims* to protect your privacy:
ixquick.com
who are being watched by ECHELON instead.
That is, big brother is either watching you with his right eye or with his left eye, but rest assured he is watching the hell out of you.
How come EU is always more consumer-protectionist (Score:5, Insightful)
The EU seems to protect its citizens and consumers from rapacious, hungry corporations more than the US, the beacon of freedom, does.
Whether it is kicking Microsoft's ass all the way back to US, or
Forcing Apple to unblock its iTunes service in France, or
Cheaper medicine and medicare that keeps the private insurers at bay, or
Privacy laws and zealous courts (in Germany) that force the government to disband its secret spyware projects, or
Libel laws that force newspapers to pay huge penalties to citizens for reckless lie mongering about their private lives, or
Airplane laws that force airlines to pay financial compensation to passengers for ditching them, or
Laws that jail CEOs and even the board for criminal conviction of corporations,...
Meanwhile the US zealously preserves corporate rights and places them above human beings, allowing and authorizing torture, etc.
How come the so-called stiff-upper-lip society values human freedoms so much, when the so-called Beacon of Democracy incarcerates its own citizens without trial?
And this even though many EU nations don't have constitutions that embody something like our First Amendment, etc.
Re: (Score:3, Insightful)
Re:How come EU is always more consumer-protectioni (Score:4, Interesting)
I always laugh when I hear Americans talk about 'liberals' as being left-wing, given that that particular ideology is generally regarded as being at the centre of the political spectrum in Europe.
One of the things I notice on Slashdot is that there's a backlash whenever a government ever tries to legislate, especially when it's the EU trying to improve consumer protection - the general idea being that they should keep their collective noses out of other people's business.
I find it odd that Americans (as Slashdotters predominantly are), whose society prides itself on being democratic, would rather take power away from their democratic institutions and hand it to undemocratic corporations. The free market theoretically exists to control the amount of power that a corporation can accumulate, but I've found that Slashdotters oppose state intervention even in instances where the free market does not operate properly (i.e. monopoly situations).
It could be that this is because the US electoral system doesn't perform as it should. The usual example I use is the US Electoral College, where the presidential election is skewed by the first-past-the-post system used entirely out of context, and is provided for by the constitution. In cases where the electoral system is flawed, why should you trust a government any more than a corporation?
The GP mentions the issue of EU countries' constitutions - I live in the UK where there is no constitution, and ultimate power is invested in parliament, which makes it much easier to dispose of anachronisms in our voting systems.
Of course I might be on the wrong track entirely. It occurs to me that the most common sense I ever hear from politicians comes from two places: the UK House of Lords and the EU Commission - both unelected bodies. It's possible that politicians are more able to act in the public good when they don't have to worry about the next election.
Re:How come EU is always more consumer-protectioni (Score:2)
Consumer protection laws have been in effect in Europe since the Middle Ages, and are taken seriously by all parties. This is probably due to a tradition of draconian punishment for repeat offenders.
Re: (Score:1)
But somehow our laws seem to punish the individual crimes, rather than the larger crimes by corporates or the government itself.
Why have we not had a CEO being jailed or hanged on behalf of a corporation which was criminally convicted of manslaughter like Exxon or the Bhopal Gas Tragedy in India?
Take for instance the zealousness of German judges in convicting Volkswagen execs: http://www.csmonitor.com/2008/0325/p12s01-woeu.ht [csmonitor.com]
Re: (Score:2)
Re: (Score:1)
Compare this to an average around 0.1%-0.2% in Europe, and an even lower rate in Japan. (and for an example from a perhaps more comparable nation, look no further than to your neighbor to the north)
Re: (Score:2)
As a fellow European I think that Verkkokauppa was absolutely correct in their response. Why didn't you check that the computer was suitable for linux before you bought it? They sold an item (i.e. the computer) for a specific purpose (i.e. to run the software that it contained). They made no promises about its suitability for running linux, or for making toast, milking cows or ensuring that the railway network runs on time. If you want to use it for any of those purposes it is your responsibility to ens
Re: (Score:1)
Re: (Score:2, Interesting)
Any Americans fancy a citizenship-swap?
Re: (Score:1)
Take Fox News for instance: deliberately calling Obama "Osama", the Swift Boat Veterans for Truth incident...
and these are just samples where I wish the ruthlessly effective libel law of the EU applied in the US.
Imagine the SBVT campaigners for bush being arrested, handcuffed and convicted to 3 months in prison.
Or Fox news CEO being convicted of Libel and forced to resign and pay huge compensations.
Or the bald idiot who exposed the CIA agent being forced to serve 18 yrs in prison....
At least in England which is the epitome of strong libel laws none of the examples you quoted would apply. Politicians generally don't sue journalists for publishing lies about them. A quick look at the English tabloids, particularly the ones owned by the people that own Fox news would tell you that. I suppose Jeffrey Archer is an example of a politician suing a news
Re: (Score:2)
I chose the name purely to mislead NSA.
-:)
Re: (Score:2)
That'd probably be because money talks in the "land of the free". Everyone is free to make their fortune, it's the "American dream", and so corporations have power. Add to that the fact that we've had centuries to do it wrong (e.g. previous feudal systems, which would have had a leaning towards supporting the nobility) and somewhere like America is bound
Re: (Score:2)
Here in the Netherlands we do have a written constitution but also more fundamental unwritten constitutional law in some areas. The argument often made against codification is that codification disentrenches it, making it possible for parliament to mess it up, and reduces its normative force. A quote from
Re:How come EU is always more consumer-protectioni (Score:5, Interesting)
In most European countries (and in effect the EU itself) there is a plethora of political parties that are likely to come into power. With so many competing parties there is a large chance at least one of your competitors will point out your shady behavior, and it is thus easier to try to outdo them in positive ways rather than malicious ones.
In contrast, in the US the entire electoral system more or less favors a two-party system, where the winner takes it all. In such a system you gain a lot by attacking a single enemy. If you're a Democrat all you need to do is to break things for the Republicans, and vice versa. Such tactics don't work if you have 5-6 potential candidates because if you try to fuck over 4 of your opponents you run the risk that they will conspire against you. The American system is very easily corrupted since once you have influence with the two main parties there is little to stop you, while gaining control of a 6-7 party parliament without anybody crying foul is more tricky.
Simply put, in the EU political parties compete for power; in the US there is more of a cartel or monopoly. You can also notice these trends if you look at individual EU countries. Britain has more of a one-party system, and consequently their politics are a lot more "American" than many other European ones.
It is also rather possible that the EU is merely better because it is relatively new at the moment, and that with time it will become corrupted as third parties learn to manipulate it. Time will tell...
Re:How come EU is always more consumer-protectioni (Score:2)
That's because these corporations are the EU's rivals for that control. The EU prefers to keep that kind of control for itself [slashdot.org].
Re: (Score:1)
Re: (Score:2)
I mean, US has always stated to other countries that they MUST follow WTO rulings and has used thugs to impose the same (eg Panama).
But somehow it fails to practice the preaching.
Isn't that odd?
Take for instance Antigua, a small country which poses no threat to US. WTO ruled in its favor against US and ordered US to pay restitution. What did US d
Re: (Score:1)
The US decided to prevent its citizens from using online gambling sites situated outside the US. Obviously no one else is going to like that, especially when the US has treaties saying they won't do that.
The state vs freedom of information (Score:2, Interesting)
-A
Re:The state vs freedom of information (Score:4, Insightful)
It is not the state controlling access - it is the state, acting on my behalf, to ensure that large organizations (including the state itself) are not entitled to use my personal information against me. If you are not covered by such protection then anyone can use your information to do you untold damage and there is nothing you can do about it.
Re: (Score:2)
I'm sure Google would demur and state the info is being "held" in US servers and hence they are not answerable to you.
Secondly, I guess you would be laughed out of the office and asked to sue.
Re: (Score:2)
The office may say "well sue us", but just because they're an American corporation saying that and they think they're above the law doesn't mean they actually are.
Chances are they'll say something about needing to know IP addresses or have a copy of your Google.com cookie and then take 40 days over it. BT (British Telecom, my phone provider) are currently saying
Re: (Score:2)
EU Recommends Slashdotting? (Score:2, Funny)
Interpretation in the US (Score:1)
Not who, what (Score:1)
It isn't "how long it is kept for" but "what is kept".
Storing search criteria forever is only an issue if it can be used for identification or reveal information about someone.
So strip the "who" and keep the "what". And you can ditch the "who" part of the data immediately for the majority of people, and let people opt-in if they want history relevant services.
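A minimal sketch of "strip the who, keep the what", assuming an in-memory log and an explicit opt-in set. All names here are invented for illustration; no real search engine's logging is being described.

```python
def log_search(log, query, user_id=None, opted_in=frozenset()):
    """Append a search record; keep the identity only for opted-in users."""
    who = user_id if user_id in opted_in else None  # the "who" is dropped at write time
    log.append({"query": query, "user": who})       # the "what" is always kept
    return log
```

Because the identity is discarded before the record is written, there is nothing to delete or anonymise after the fact for anyone who has not opted in.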
On a side note (Score:1)
Google's spell checking is the best.
It is far more reliable than the majority of spell checkers, it almost always knows people's correct names, cities and states and principalities and so on and so forth.
Raises the question ... (Score:1)
AFAIK, Google's situation is different because it has a physical presence in Ireland and Belgium, and so they probably will have to eventually conform to the regulations in question. In principle, I like the EU personal data protection laws, but in general I believe that people need to protect themselves by controlling their own behavior rather than the actions of others.
Terms of use (Score:1)
All that being said, I, too, dislike their lengthy data retention policies, but I continue to use their services. Oh well.