Anonymous Cowards, Deanonymized
mbstone writes "Arvind Narayanan writes: What if authors can be identified based on nothing but a comparison of the content they publish to other web content they have previously authored? Narayanan has a new paper to be presented at the 33rd IEEE Symposium on Security & Privacy. Just as individual telegraphers could be identified by other telegraphers from their 'fists,' Narayanan posits that an author's habitual choices of words, such as, for example, the frequency with which the author uses 'since' as opposed to 'because,' can be processed through an algorithm to identify the author's writing. Fortunately, and for now, manually altering one's writing style is effective as a countermeasure."
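To make the 'since' versus 'because' idea concrete, here is a minimal sketch of one stylometric feature: the rate at which a text uses a handful of function words. The word list and the name `function_word_rates` are assumptions chosen for illustration; this is not the feature set or the classifier from the paper.

```python
import re
from collections import Counter

# Hypothetical word list for illustration; a real system would use many more features.
FUNCTION_WORDS = ("since", "because", "although", "though", "while", "whilst")


def function_word_rates(text, words=FUNCTION_WORDS):
    """Return each word's share of all tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {word: counts[word] / total for word in words}


if __name__ == "__main__":
    sample = "Since you asked, I left because it rained, because I was tired."
    for word, rate in function_word_rates(sample).items():
        print(f"{word}: {rate:.3f}")
```

Two texts by the same author would be expected to give similar rate vectors, which is what a classifier could then compare.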
In this exploration the algorithm's first choice was correct 20% of the time, with the poster being in the top 20 guesses 35% of the time. Not amazing, but: "We find that we can improve precision from 20% to over 80% with only a halving of recall. In plain English, what these numbers mean is: the algorithm does not always attempt to identify an author, but when it does, it finds the right author 80% of the time. Overall, it identifies 10% (half of 20%) of authors correctly, i.e., 10,000 out of the 100,000 authors in our dataset. Strong as these numbers are, it is important to keep in mind that in a real-life deanonymization attack on a specific target, it is likely that confidence can be greatly improved through methods discussed above — topic, manual inspection, etc."
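The arithmetic in that quote can be checked in a few lines. The 100,000 authors, the 20% baseline, the halved recall and the 80% precision come from the quote; the number of authors the algorithm attempts to name is not stated there and is derived here from the usual definition precision = correct / attempted.

```python
total_authors = 100_000
baseline_recall = 0.20   # top guess correct for 20% of authors
precision_after = 0.80   # when the algorithm does answer, it is right 80% of the time

# "Only a halving of recall": 20% becomes 10%.
recall_after = baseline_recall / 2
correct = round(recall_after * total_authors)

# Derived, not quoted: precision = correct / attempted.
attempted = round(correct / precision_after)

print(f"recall after thresholding: {recall_after:.0%}")
print(f"authors correctly identified: {correct}")
print(f"authors the algorithm attempts to name: {attempted}")
```

Under these numbers the algorithm would stay silent on most of the dataset and answer only where it is confident.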
First (Score:5, Funny)
Re:First (Score:5, Funny)
Got you! Using the power of de-anonymisation, I have discovered that you are none other than...
Bicx! [slashdot.org]
This stuff really works.
Re: (Score:1)
Still, without other forms of authentication, it's just an educated guess.
Take a community like Slashdot, for example - I see somebody write an interesting or witty phrase, and add that phrase to my vocabulary, repeating it in another discussion later. Does that make me that person? Now imagine 20 or more people doing the same.
-- Ethanol-fueled
Re:First (Score:5, Funny)
Then I wouldn't want to do any different.
-- Ethanol-fueled
Re: (Score:2)
Presently, picture, in you're head, graeter then 20 folks copying you.
Gah... there gunna git me every time.
Comment removed (Score:5, Interesting)
Re: (Score:3)
Bah!
Who needs software when we've got at least two dudes here who can identify hundreds of folks known as the Great Bonchime, or something like that....
cheers,
Re:First (Score:5, Interesting)
And easily-defeated. One of the projects of my senior class at university was the building of software to defeat that kind of detection. It was crafted primarily so dissidents in foreign countries could speak without fear, by analyzing the author's writing patterns, and offering solutions to shift the writing to a different style.
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
You mean formerly anonymous?
Re: (Score:2)
FTFY?
Re: (Score:2)
Perfect.
Re:First (Score:4, Funny)
Anyway why do you cower? what are you afraid of?
Wait a minute...
Re: (Score:2)
Imagine thousands of accounts doing the same thing then slashdot = stagnated. Anyway why do you cower? what are you afraid of? Wait a minute... ;)
Of course, that raises the question of whether the multiple "Michael Kristopeit" accounts are actually someone called Michael Kristopeit, whether they're pretending to be someone else with that name, or whether the name was randomly chosen and any resemblance to any real "Michael Kristopeits" is purely coincidental. :-)
Re: (Score:2)
hmm what villain wants to make possible one of the most necessary elements for mankind to live free, you say.
K, I'll bite.
Smith & Wesson?
Oh, wait, wait, I got it, no...I had it, no, wait wait wait... Larry Flynt !
Gotta be Larry although you gotta hand it to Hitler for the Volkswagen.
But I do get that you see this clown as having sold out humanity for whatever his reward for this was.
Re: (Score:1)
and these pseudorandom mispellings will make you unique, because everybody has another choice of "random" misspellings.
Re: (Score:2)
Re:First (Score:5, Interesting)
Comment removed (Score:4, Interesting)
Re: (Score:3)
Hell why even post at all if nobody is gonna understand you clearly?
Well, this seems to work for about 1/2 the comments on slashdot! :D
Re: (Score:3)
Re: (Score:2)
It would butcher some flows and really help others.
We will all meet in the boring, anonymous middle of mediocre writing.
Re:butcher the flow (Score:2)
Gem from a lost soul in my childhood:
"What was it when for you said there was maybe like a lot of there but there wasn't and you knew it?"
Re: (Score:2)
Re: (Score:2)
At least you're coherent!
Sounds also like the Jamaican dialect among the seasonal workers in my area. I'll sign off with:
"Me ha' go' way now co' me ha' simtn' fi' fix eh moi sumtn' fin yaum."
(aka "Catch ya later, I have to go fix a bunch of $hit and I'm fscking hungry." in /. lingo.)
Re: (Score:3)
Re: (Score:2)
There, RTFY (Reanonymized That For You)
Re: (Score:2)
Re: (Score:3)
The Yoda mod?
Re: (Score:2)
Tell us apart you cannot.
Re: (Score:2)
Anonymous is way ahead of you, to anonymize their writing style they all speak in memes :-P
"Afro-men at the habbo pool"
"Angry dad, when we done goofed"
Re: (Score:2)
Re: (Score:2)
Darmok and Jalad at Slashdot.
Re: (Score:1)
This just begs a "reanonymize" browser plugin to alter one's writing style...
All one gots'ta do be to run some sample uh speech drough de Dialectizer, conveniently located at http://www.rinkworks.com/dialect/dialectt.cgi [rinkworks.com], so cut me some slack, Jack.
One wouldn't realize dis, but ah' have some university educashun and am highly fluent in de Enlgish language.
Now ah' can sound likes I'm de average slashdot eyeballer. Ah be baaad...
Re: (Score:2)
First!
Analyze this anon comment, suckers!
Kristopeit, is that you?
Re: (Score:1)
I found this lying on the floor, perhaps you need it? <a href="
Re:Not cool. (Score:5, Insightful)
Re: (Score:2)
Re: (Score:1)
Being able to speak your mind without consequences requires privacy, and I think it's beneficial to have that option (and not to pretend you're talking without anonymity). For example, in real life, I wouldn't bother arguing with your point because there's no benefit to me doing so, and a
better way. (Score:5, Interesting)
This is, of course, not really new.
A couple of years ago, there was some news (cannot find the link now) that some researchers tried this with a more statistical approach. As an implementation they used a compression algorithm.
I had a try with this on a forum. Somebody posted a long story anonymously, but I suspected the author. I gathered 10 posts from 5 authors, including the suspect. Then I cut the amount of text to equal length. Subsequently I added the anonymous text to each of the 10 samples and bzipped the resulting text.
The resulting zipped file was shortest in the case where I added the unknown text to the samples from the suspected author. The bzip algorithm apparently decided there was more similarity between the posts.
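The experiment described above can be sketched roughly like this. It is a variant, not the poster's exact procedure: instead of trimming the samples to equal length and comparing total file size, it compares how many extra compressed bytes the unknown text costs when appended to each author's sample. The names `compressed_size`, `extra_bytes` and `guess_author` are made up for this sketch.

```python
import bz2


def compressed_size(text: str) -> int:
    """Size in bytes of the bzip2-compressed UTF-8 text."""
    return len(bz2.compress(text.encode("utf-8")))


def extra_bytes(sample: str, unknown: str) -> int:
    """Extra compressed bytes needed once the unknown text is appended to a sample."""
    return compressed_size(sample + "\n" + unknown) - compressed_size(sample)


def guess_author(unknown: str, samples: dict) -> str:
    """Pick the author whose sample makes the unknown text cheapest to compress."""
    return min(samples, key=lambda author: extra_bytes(samples[author], unknown))


if __name__ == "__main__":
    samples = {
        "alice": "Since the weather turned, I have been walking to work along the canal.",
        "bob": "Honestly the whole framework is garbage because nobody tested it.",
    }
    print(guess_author("Since nobody else takes that path, it is quiet.", samples))
```

With samples this short the result is mostly noise; the idea only has a chance with a fair amount of text per author, and even then it is a guess rather than proof.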
Although this was by no means a real scientific test, I turned out to be correct and was rather pleased with the result. Seems to me such an approach could also be useful for other things. Why log in to /. when it can just figure out who you are based on what you have just written?
To maintain anonimity you would just have to insert random shit into your posts.
Bonus points for the slashdotter who can deduce my identity based on the non-randomness of this post.
Re: (Score:3)
Re: (Score:3)
I think they may need to work on that a bit. I just tested three samples of my writing, all in a similar style, and got three different authors.
Re: (Score:2)
That's odd. I saw his link earlier today and made a mental note to check it out later (later being now).
I took three blog entries of mine and sent them through iwl.com and got three different authors. Came back to slashdot to post my findings and found out that someone else had already done the exact same thing (same sample number too) and posted it to the web.
You're not an alternate personality of mine are you? Are you my personality disorder (or am I yours?)
Re:deduce my identity (Score:2)
I was going to guess either Tom Womack or Baldrson, but I'm out of time and I don't think I'm right.
Re: (Score:2)
I'm gonna guess eldavojohn based on the writing style, far on the "unrushed/low ADD factor" end of the spectrum.
Re: (Score:1)
Based on a google search for site:slashdot.org "anonimity" "Bonus points"
http://apple.slashdot.org/comments.pl?sid=2467532&cid=37655486 [slashdot.org]
Tool to improve your writing skills (Score:5, Interesting)
If it can identify you based on your idiosyncrasies, I suppose that means writers could use software based on these techniques to identify the idiosyncrasies in their own writing. From there, they can learn new ways to express themselves and write in a more colorful and varied manner.
Heck, it can even be a tool that teaches you to think in a more varied manner.
Re: (Score:3)
If it can identify the idiosyncrasies in your writing, it can identify them in others'. I wonder if it can alter your "anonymous" controversial rant to look like that other.
Re: (Score:3)
Re: (Score:2)
I just write like Al Swearengen talks, myself.
Software can rewrite your texts (Score:2)
If it's all automated, writers don't need to learn new ways to express themselves. Software can do that for them!
Re:Tool to improve your writing skills - exists (Score:5, Interesting)
Re: (Score:1)
Thanks for pointers to those programs (I'll try the free stylewriter soon) - I've wondered about programs like that ever since I tried my hand at sentence generators back in high school.
Have you tried these products and/or do you work in or have you benefited from that area of software development? (it's been too long - how do I PM you on slashdot??)
8-PP
Unfortunately it doesn't.... (Score:2)
....work on those in government creating fake identities for spying and provoking things that help them justify their pointless jobs.
Looking at the percentages... Hmmmmm...
Discovered yet again (Score:1)
How many times is this going to be 'discovered' and featured on the front page of Slashdot? It's old news. We get it. No need to publish another story on the topic, there's been one a quarter or so for years.
change your posting style.. (Score:5, Interesting)
if you're stupid enough to not change your posting style when trolling, your own bad.
Call Anonymous cowards (Score:1)
Nothing new really (Score:2)
Even going back to the day when forums on the internet were email lists, there would always be some immature person who was a regular who switched aliases.
It would be so obvious from their pattern of writing that their new alias would seem as effective as a disguise as merely putting dark glasses on.
Defend yourself ... (Score:2)
I don't always... (Score:1)
I don't always attempt to identify an author, but when I do, I find the right author 80% of the time.
meh (Score:3)
Re: (Score:1)
Meh, doesn't work that well, because not everybody can speak Spanish.
Re:meh (Score:4, Funny)
Re: (Score:2)
Meh, doesn't work that well, because not everybody can speak Spanish.
Apparently not everyone can hablar francés, either.
Zooooooom!
(German for "Wooooosh!")
Re: (Score:2)
Re: (Score:2)
Google speaks Spanish for me.
Re: (Score:2)
that was japanese you stupid idiot
This is why (Score:4, Funny)
Remember, kids, practice redundant privacy measures to ensure you will never be exposed.
k... (Score:1)
So... (Score:2)
floxinoxinihilipilification (Score:4, Funny)
Damn! I'll have to stop using floxinoxinihilipilification so much in my anonymous posts or people will know it's me!
Using the logic proposed in the article- can we assume that all the anonymous cowards using "the other f word" are all Samuel L Jackson?
Fists? (Score:1)
Just as individual telegraphers could be identified by other telegraphers from their 'fists,'
...anonymous posters can be identified by their Frists? [google.com]
Tell us something we don't know already (Score:1)
Tell us something new. This is how they caught Kaczynski the Unabomber. Analysis of word choice, word frequency, and sentence structure can and will identify you. And? Identifying a single person from their many anonymous messages online leads you back to anonymous.
Aside from that it's easy enough to alter your writing to fool the analysis if you want to. Please tell us something new that every single person on Slashdot doesn't already know.
i help admin a small town web discussion site (Score:3)
around 2,000 users
#1. the smaller the town, the pettier the politics
#2. there is one user we keep banning, and they keep coming back under a new name, and you can always tell with 100% accuracy that it is the same person, based on sentence cadence and agenda, and overall personality and attitude
Re: (Score:2)
Re: (Score:2)
i thought that was called the rubber room
it's funny to watch users who are not aware of this maneuver yet screaming about why people are not replying to them
alas, not possible under the current system (ning)
right author 100% of the time* - Gravatar (Score:3)
I've mentioned this before, but it's worth repeating as more and more services no longer use their own identity systems, relying instead on Gravatar, or doing away with their own comments system by relying on Disqus (which uses Gravatar).
In the case of sites using Gravatar incorrectly*, which is pretty much all of them, 'anonymous' posts still have their Gravatar ID attached - which is just an MD5 of the person's e-mail address. All you then need to do is find that same MD5 on another site where the author opted not to post anonymously.
The main reason this ties into the story at hand is in getting reference material together. With e.g. Disqus, you can be reasonably assured (unless account sharing occurs) that the anonymous post with MD5 X on site A is authored by the same person as that of the anonymous post with MD5 X on site B, and you can include both in the pool of reference material.
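A rough sketch of the linking described above, assuming the long-standing Gravatar convention of taking the MD5 of the trimmed, lowercased e-mail address. The function names, site names and sample posts are made up for illustration.

```python
import hashlib
from collections import defaultdict


def gravatar_hash(email: str) -> str:
    """MD5 hex digest of the trimmed, lowercased address (assumed Gravatar convention)."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def link_posts_by_hash(posts):
    """Group (site, avatar_hash, text) tuples by hash; keep hashes seen on more than one site."""
    grouped = defaultdict(list)
    for site, avatar_hash, text in posts:
        grouped[avatar_hash].append((site, text))
    return {
        avatar_hash: items
        for avatar_hash, items in grouped.items()
        if len({site for site, _ in items}) > 1
    }


if __name__ == "__main__":
    posts = [
        ("site-a", gravatar_hash("someone@example.com"), "posted 'anonymously'"),
        ("site-b", gravatar_hash(" Someone@Example.com "), "posted under a real name"),
        ("site-a", gravatar_hash("other@example.com"), "unrelated post"),
    ]
    for avatar_hash, items in link_posts_by_hash(posts).items():
        print(avatar_hash, items)
```

Nothing here needs access to the e-mail address itself; the hash embedded in each page's avatar URL is enough to join the posts.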
( This also means there are issues with anonymity even if the author always posts 'anonymous'. )
* The worst part of this is the website owners. Aside from letting anonymous posts still grab their results from Gravatar (even if you don't have a Gravatar 'account', the e-mail address you use will be the MD5 in the HTML), some sites implement Gravatar as an afterthought. You could have been posting to a site for years behind a pseudonym, knowing that you're reasonably anonymous - and then find your pseudonym, and all the posts made, linked to other posts at other sites because the website owner decided to use Gravatar to display users' avatars of choice, using the e-mail address in their account.
Gravatar is a useful service, especially in that the website can save some bandwidth, and the users who do want it can just update a single avatar and have that immediately be used on any site that uses the service.
But I implore webmasters to consider seriously the ramifications of using Gravatar or Disqus, and at least:
1. Disallow Gravatar on posts, profiles, etc. that were created before your implementation of Gravatar.
2. Create an opt-in system for the use of Gravatar, per-profile.
3. Disable the Gravatar code when the post author has indicated that they want to post anonymously.
4. If implementing Disqus, make clear that its service may not adhere to your site's own privacy policies, and posting anonymously is a façade.
Much the same applies to other login, profile, and comment consolidation/aggregation/syndication systems (such as Facebook's), but especially in the case of Gravatar (which requires no user interaction such as a login or existing valid login state), it is all too easy to think only of the benefits.
The circumvention? Plagiarism! (Score:3)
This is called stylometry (Score:3)
Like fingerprints? (Score:1)
Self promo (Score:1)
Re: (Score:2)
Would that be enough, however? I fear, though, that this might be the new handwriting analysis craze. Still, each person has quirks to their writing to some degree. For one, I think my usual quirk stands out quite well, yeah.
I exaggerated it for the sake of making it obvious. I wonder how good this system is at picking up things like this. Meaning, if I started talking like this:
Yo dawg, the meta-battle between anons and the man is heating up. Cool story bro, but we need fight this now. Our privacy is i
Re: (Score:3)
Re: (Score:3)
Arright la, I warant eh Scouse filta by tomorra.
Re: (Score:2)
I am Brian, and so is my wife.
Re: (Score:2)
Hi Steve!
Re: (Score:1)
HB Gary had poor security.
An SQL injection into their website's custom-built CMS, which didn't salt any hashed passwords. One of the recovered passwords was also used for an email account (a lesson in not reusing the same password). Anonymous then logged into the email account and sent an email to the system administrator asking for the server's root password, in a plain-text email I might add. So the server's root password was emailed back to the attacker in the compromised account, and the rest is history.
That'