
Linguistics Identifies Anonymous Users

Posted by Soulskill
from the that's-why-i-run-my-emails-through-google-translate-a-few-times dept.
mask.of.sanity writes "Researchers have examined writing styles to identify previously anonymous carders and hackers operating on underground forums. Up to 80 percent of users who wrote at least 5000 words across their posts could be identified using linguistic techniques. Techniques such as stylometric analysis were used to track users who posted across different forums, and could even be used to unveil authors of thesis papers or blogs who had taken to underground networks."
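
For readers unfamiliar with the term, one very simple form of stylometric comparison can be sketched as follows. This is an illustration only, not the researchers' method: it assumes a tiny hand-picked list of function words (`FUNCTION_WORDS`) and compares texts by the cosine similarity of their word-frequency profiles. The function names `profile`, `cosine`, and `most_similar` are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a deliberately tiny, hand-picked list of function words.
# Real stylometric feature sets are far larger and more varied.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in",
                  "that", "is", "it", "but", "i", "not"]


def profile(text):
    """Return the relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(unknown, candidates):
    """Return the key in `candidates` whose text profile is closest to `unknown`."""
    target = profile(unknown)
    return max(candidates,
               key=lambda name: cosine(target, profile(candidates[name])))
```

With only a handful of words per author such a comparison is close to meaningless, which is consistent with the summary's figure applying only to users with at least 5000 words of posts.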

  • I worked for a smallish (but not incredibly tiny, maybe 100 employees) company and wrote a letter to the CEO once. We'd been castigated by someone who'd taken over the local office because the company was doing poorly. A number of austerity measures were implemented. I did not find those to be that annoying because I realized it was either that or not have a job. But the castigation didn't sit well with me. We were in trouble because of the decisions of a few bad managers, not the behavior of average employees.

    So I wrote a letter about it. He stripped my name off and presented it in an executive meeting to all the people directly under him. He asked "Why am I getting letters like this?" Everybody who worked in my office immediately knew who it was. I had a distinctive writing voice, and a strong reputation.

    It did not lead to me being fired. I was actually highly respected there. It led to me being encouraged to have an honest sit-down talk with the new manager for our division (the guy who'd made the speech I wasn't happy about). I think we both came away from that meeting a lot happier about the other.

    But that was a strong lesson to me. If I ever really want to be anonymous I'm going to have to purposely work on adopting a completely different writing style. And I will have to keep a wall up between styles and never 'slip'.

  • This is so bad I don't know where to begin. There is nothing, ever, that excuses this. For every zodiac crazy serial killer or copyright scofflaw they try to apply this to (and fail) there will be thousands and thousands of people that will be persecuted by organizations and governments for expressing their opinions. While this won't have a big effect in the West for half a generation, oppressive governments are going to be all over this.

    And then, in ten or fifteen years, the youth will have grown with this technology and become accustomed to it...accepting it. Just like facebook has been accepted.

    I'd move to Mars when it's possible but some bureaucrat will analyze everything I've ever written on the interwebz (and I've been mostly not stupid about shit I've written online since 1995 or so) and make some arbitrary decision about how I'm not acceptable because I'm not a huge fan of authority or some such crap.

    Way to go humanity.

  • google translate (Score:5, Interesting)

    by sl149q (1537343) on Wednesday January 09, 2013 @03:03AM (#42529321)

    One way to change a bunch of the stylistic cues would be to convert your message to another language and back using Google Translate. Varying the intermediate language(s), and possibly using different translators, should neutralize some things.

  • I've thought about that. That's an interesting and tricky problem. Though, if there's a program that can detect it, that means the patterns are codified well enough that you can write a program to obscure them. The problem is the detector whose implementation you don't know: will you actually be fooling it?

    Of course, you have the same problem if you adopt a different writing style. Is it different enough? Is something essential slipping through?

    You could use both techniques. Have a program assist you in avoiding the use of certain words when using one voice and the use of others when using a different voice.
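
A minimal sketch of the word-avoidance helper described above, assuming you have already built, by hand, a set of marker words you associate with your other voice. The function name `flag_marker_words` and the example word list are hypothetical, and this only catches vocabulary, not sentence structure or punctuation habits.

```python
import re


def flag_marker_words(draft, avoid):
    """Return (line_number, word) pairs for every word in `draft` found in `avoid`.

    `avoid` is an assumed, hand-built collection of words tied to another voice.
    Matching is case-insensitive; line numbers start at 1.
    """
    avoid_lower = {w.lower() for w in avoid}
    hits = []
    for lineno, line in enumerate(draft.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z']+", line):
            if word.lower() in avoid_lower:
                hits.append((lineno, word))
    return hits


if __name__ == "__main__":
    draft = "We were castigated again.\nActually, it was fine."
    for lineno, word in flag_marker_words(draft, {"castigated", "actually"}):
        print(f"line {lineno}: consider replacing '{word}'")
```

As noted above, passing a check like this says nothing about a detector whose features you have not seen.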

  • by girlinatrainingbra (2738457) on Wednesday January 09, 2013 @03:55AM (#42529553)
    Sock puppet accounts are also apparent from these linguistic tics. Sometimes, resorting to a particular analogy or getting hot-tempered at a specific topic or a certain kind of point of view can also give away the identity of the author. So maybe limit oneself to 5000/144 = 34 tweets per Twitter account so that you can't be figured out. And writing style and favorite kinds of rant were also how the Unabomber [wikipedia.org] was found out: his family members recognized his particular pet peeves and rants and writing patterns and sent their suspicions in to the F.B.I.
  • by toutankh (1544253) on Wednesday January 09, 2013 @06:05AM (#42530209)

    After reading TFA I cannot find any convincing experimental validation. I see a lot of "can" and conditional tense (maybe that's the author's style), but nothing on the validation of the approach. Where is the experimental data, including the number of anonymous users correctly and incorrectly identified on forums?

  • by Will.Woodhull (1038600) <wwoodhull@gmail.com> on Wednesday January 09, 2013 @10:04AM (#42531979) Homepage Journal

    I used to post anonymously much more often, when I had a job with a guvmint agency and a young family to protect. I do not bother with that much any more. I am not invulnerable, but for the most part I know that I look like too small a fish to be worth going after.

    That said, I still occasionally post anonymously when I want to antagonize the astroturfers, Scientology nuts, etc. Especially on slashdot if I am concerned that my post might damage my karma.

    Interesting things to do when posting anonymously:

    Use a thesaurus to choose synonyms you would not ordinarily use.

    L33t 5p33k

    Write like Hemmingway. Keep all sentences short. Sentences that do not have subordinate clawses do not have much style to analyse.

    Use creative misspellings. "claws" for "clause", etc.

    Use Google Translate to do a multilingual hash: translate your work into Russian, then the Russian version back to English. "The spirit is willing but the flesh is weak" becomes "The wine is passable but the meat has gone bad."

    Ideally, Anonymous will develop a set of tools that will rewrite any text into one of half a dozen different styles. Let the authorities chase after these six fictional characters.
