Privacy Your Rights Online

Anonymous Cowards, Deanonymized 159

mbstone writes "Arvind Narayanan writes: What if authors can be identified based on nothing but a comparison of the content they publish to other web content they have previously authored? Narayanan has a new paper to be presented at the 33rd IEEE Symposium on Security & Privacy. Just as individual telegraphers could be identified by other telegraphers from their 'fists,' Narayanan posits that an author's habitual choices of words, such as, for example, the frequency with which the author uses 'since' as opposed to 'because,' can be processed through an algorithm to identify the author's writing. Fortunately, and for now, manually altering one's writing style is effective as a countermeasure." In this exploration the algorithm's first choice was correct 20% of the time, with the poster being in the top 20 guesses 35% of the time. Not amazing, but: "We find that we can improve precision from 20% to over 80% with only a halving of recall. In plain English, what these numbers mean is: the algorithm does not always attempt to identify an author, but when it does, it finds the right author 80% of the time. Overall, it identifies 10% (half of 20%) of authors correctly, i.e., 10,000 out of the 100,000 authors in our dataset. Strong as these numbers are, it is important to keep in mind that in a real-life deanonymization attack on a specific target, it is likely that confidence can be greatly improved through methods discussed above (topic, manual inspection, etc.)."
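The precision and recall figures quoted above are easier to follow as plain arithmetic. The sketch below is only an illustration: the function name `attempts_and_hits` is made up here, and it assumes precision is "correct guesses / guesses made" and recall is "correct guesses / all authors", using the 100,000-author figure from the summary.

```python
# Worked arithmetic for the quoted precision/recall figures.
# Assumption: precision = correct / attempted, recall = correct / total authors.
TOTAL_AUTHORS = 100_000  # dataset size quoted in the summary


def attempts_and_hits(total, recall_pct, precision_pct):
    """Return (guesses_made, correct_guesses) for the given percentages."""
    correct = total * recall_pct // 100          # recall fixes how many are right
    attempted = correct * 100 // precision_pct   # precision fixes how many guesses that took
    return attempted, correct


# Baseline: the algorithm guesses for everyone and is right 20% of the time.
baseline = attempts_and_hits(TOTAL_AUTHORS, recall_pct=20, precision_pct=20)

# With a confidence cutoff: recall halves to 10%, precision rises to 80%.
thresholded = attempts_and_hits(TOTAL_AUTHORS, recall_pct=10, precision_pct=80)

print("baseline (guesses, correct):", baseline)
print("thresholded (guesses, correct):", thresholded)
```

Under those definitions, the 80% figure implies the algorithm only ventures a guess for a fraction of the authors and stays silent on the rest, which is why overall coverage drops to 10,000 of 100,000.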
This discussion has been archived. No new comments can be posted.

  • Re:First (Score:5, Interesting)

    by FriendlyLurker ( 50431 ) on Tuesday February 21, 2012 @09:11AM (#39109413)
    This just begs a "reanonymize" browser plugin to alter one's writing style...
  • better way. (Score:5, Interesting)

    by Anonymous Coward on Tuesday February 21, 2012 @09:20AM (#39109497)

    This is, of course, not really new.

    A couple of years ago, there was some news (cannot find the link now) that some researchers tried this with a more statistical approach. As an implementation they used a compression algorithm.

    I had a try with this on a forum. Somebody posted a long story anonymously, but I suspected the author. I gathered 10 posts from 5 authors, including the suspect. Then I cut the amount of text to equal length. Subsequently I added the anonymous text to each of the 10 samples and bzipped the resulting text.

    The resulting zipped file was shortest in the case where I added the unknown text to the samples from the suspected author. The bzip algorithm apparently decided there was more similarity between the posts.

    Although this was by no means a real scientific test, I turned out to be correct and was rather pleased with the result. Seems to me such an approach could also be useful for other things. Why log in on /. when it can just figure out who you are based on what you have just written?

    To maintain anonymity you would just have to insert random shit into your posts.

    Bonus points for the slashdotter who can deduce my identity based on the non-randomness of this post.
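The compression experiment described in the comment above can be sketched in a few lines of Python with the standard `bz2` module. This is a rough reconstruction under assumptions, not the commenter's actual script: the function names are hypothetical, and instead of comparing total file sizes it compares how many extra bytes the unknown text adds to each sample, so that a sample which happens to compress well on its own does not win by default.

```python
import bz2


def compressed_size(text):
    """Size in bytes of the bzip2-compressed UTF-8 encoding of text."""
    return len(bz2.compress(text.encode("utf-8")))


def extra_bytes(sample, unknown):
    """Bytes the unknown text adds when compressed together with a known sample."""
    return compressed_size(sample + "\n" + unknown) - compressed_size(sample)


def rank_authors(samples, unknown):
    """Rank candidate authors, most similar first.

    samples maps an author name to that author's known text. Samples are
    trimmed to the length of the shortest one, as in the comment, so no
    author gets an advantage from simply having more text.
    """
    length = min(len(text) for text in samples.values())
    scores = {
        author: extra_bytes(text[:length], unknown)
        for author, text in samples.items()
    }
    # Fewer added bytes means the compressor found more shared patterns.
    return sorted(scores.items(), key=lambda item: item[1])
```

Calling `rank_authors({"suspect": ..., "other": ...}, anonymous_text)` returns `(author, added_bytes)` pairs with the best match first. On short posts the differences can be a handful of bytes, so treat the ranking as a hint rather than proof.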

  • by bigsexyjoe ( 581721 ) on Tuesday February 21, 2012 @09:22AM (#39109529)

    If it can identify you based on your idiosyncrasies, I suppose that means writers could use software based on these techniques to identify the idiosyncrasies in their own writing. From there, they can learn new ways to express themselves and write in a more colorful and varied manner.

    Heck, it can even be a tool that teaches you to think in a more varied manner.
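As a sketch of what such a self-check tool might measure, the snippet below counts how often a text uses a few function words, including the "since" versus "because" preference mentioned in the summary. The word list and function names are assumptions chosen for illustration; the paper's actual feature set is not described here and is certainly much larger.

```python
import re
from collections import Counter

# Hypothetical, tiny word list; real stylometry uses many more features.
FUNCTION_WORDS = ("since", "because", "although", "though", "while", "whereas")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def function_word_profile(text, words=FUNCTION_WORDS):
    """Occurrences per 1,000 tokens for each word in the list."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {word: 1000 * counts[word] / total for word in words}


def preference(text, a="since", b="because"):
    """Share of uses of a among all uses of a or b; None if neither appears."""
    counts = Counter(tokenize(text))
    used = counts[a] + counts[b]
    return counts[a] / used if used else None
```

Running `function_word_profile` over a batch of your own posts and comparing the numbers against other people's text gives a crude picture of which habits stand out.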

  • by ardiri ( 245358 ) on Tuesday February 21, 2012 @09:45AM (#39109775) Homepage

    If you're stupid enough to not change your posting style when trolling, it's your own bad.

  • Re:First (Score:5, Interesting)

    by lightknight ( 213164 ) on Tuesday February 21, 2012 @10:23AM (#39110161) Homepage

    And easily defeated. One of the projects of my senior class at university was the building of software to defeat that kind of detection. It was crafted primarily so that dissidents in foreign countries could speak without fear: it analyzed the author's writing patterns and offered suggestions for shifting the writing to a different style.

  • by rarrar ( 671411 ) on Tuesday February 21, 2012 @10:54AM (#39110557)
    Schools already use programs like "White Smoke" http://www.whitesmoke.com/ [whitesmoke.com] and "Style Writer" http://www.stylewriter-usa.com/ [stylewriter-usa.com] to identify grammar errors and stylistic errors, and suggest corrections. These programs are able to identify active and passive voice, clarity and readability of writing, ambiguous words, gender-specific words, cliches, and more. I'm not sure the use of such software is such a great idea. I guess it's OK as long as a teacher reviews the results. Then again, if the teacher doesn't do as good a job as the program does...
