Privacy Security Your Rights Online

Linguistics Identifies Anonymous Users (215 comments)

Posted by Soulskill
from the that's-why-i-run-my-emails-through-google-translate-a-few-times dept.
mask.of.sanity writes "Researchers have examined writing styles to identify previously anonymous carders and hackers operating on underground forums. Up to 80 percent of users who wrote at least 5000 words across their posts could be identified using linguistic techniques. Techniques such as stylometric analysis were used to track users who posted across different forums, and could even be used to unveil authors of thesis papers or blogs who had taken to underground networks."

  • by kawabago (551139) on Wednesday January 09, 2013 @03:40AM (#42529203)
    I'd be rather surprised if someone else couldn't.
  • Re:College essays (Score:4, Insightful)

    by ForgedArtificer (1777038) on Wednesday January 09, 2013 @04:35AM (#42529457) Homepage

    Actually, it's the exact opposite.

    Anti-plagiarism software searches for the same content with completely different styles.

    Writer identification involves searching for the same style amongst completely different content.

  • by Anonymous Coward on Wednesday January 09, 2013 @06:49AM (#42530143)

    And Google (a.k.a. "The Evil Empire" TM) will have a cached copy of the original with the IP address you posted from. In other words, you'll also need to go through the magic 7 proxies!

  • by Anonymous Coward on Wednesday January 09, 2013 @10:02AM (#42531315)

    I think a bigger threat to geeks in business is when they approach such situations without due caution. If you make a claim, you must be prepared to back it up to everyone who could be interested. Real concrete evidence. References. Citations. Etc.

    And that IS approaching the situation without due caution. Geeks think that having real concrete evidence means that other people must believe you. Real-world people are not like that, especially the politically minded ones. Evidence be damned, politically minded people play power games without regard to reality, all the way until the company goes bankrupt, then they play their game elsewhere.

    Approaching with due caution means you must first prepare by finding someone more powerful to back you up, and be ready to find another job even so.

    The OP survived the episode because he implicitly had the CEO's backing, as the CEO challenged the managers (i.e. had already publicly shown that he agreed there was a problem with some manager). Had the CEO simply quietly sent a copy of the letter out to the managers and told them to "deal with it", the OP would likely have been fired or forced to leave.

  • by Hotawa Hawk-eye (976755) on Wednesday January 09, 2013 @12:41PM (#42533373)

    Nothing, as long as you have a large enough corpus of the framee's writing. If the framee is your friend, this probably isn't a problem. If they're a public figure, maybe not a problem (depending on how much editing and PRing their written statements undergo before they are released). If they're $RANDOM_PASSERBY, not so easy.

    I think a more common usage would be to tweak your own writing just so it doesn't sound like you. Write something you don't want identified as yours (the test sample), then check it against a corpus of your own written work. If it detects as your work, rough up the test sample until it doesn't. This would be an easier problem than the framing case, since you're not trying to make it look like a specific other person's work; you're trying to make it look like it's ANYONE else's (you don't really care whose) work.
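
For anyone who wants to try the "check the test sample against your own corpus" idea from the comments above, here is a minimal sketch in Python. It is not the researchers' method, which the summary does not describe. It assumes a toy style measure: relative frequencies of a handful of English function words, compared by cosine similarity. The word list, the function names and the 0.9 threshold are all made up for illustration, and real stylometry uses far richer features and far more text.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny list of function words (an assumption,
# not a standard list). Function words carry style rather than topic.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is",
                  "it", "for", "with", "as", "but", "not", "you", "i")


def style_vector(text):
    """Return the relative frequency of each function word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty input
    counts = Counter(words)
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def looks_like_author(sample, corpus, threshold=0.9):
    """True if the sample's style vector is close to the corpus's.

    The threshold is an arbitrary illustrative value, not a calibrated one.
    """
    return cosine(style_vector(sample), style_vector(corpus)) >= threshold
```

The loop described above would then be: call `looks_like_author(test_sample, my_corpus)`, and keep rewording the sample while it returns True. Note that this compares style across different content, which is the opposite of what a plagiarism checker does when it looks for the same content.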
