The Courts Government Privacy News Science

Writing Style Fingerprint Tool Easily Fooled

Urchin writes "Some of the techniques used by literary detectives and courts of law to identify the authorship of text are easily fooled, say US researchers. They found that non-professional writers could hide their identity from 'stylometric' techniques by writing in the style of novelist Cormac McCarthy. Stylometric methods have been used in a number of high-profile legal cases in recent decades, including the 'Unabomber' trial. 'We would strongly suggest that courts examine their methods of stylometry against the possibility of adversarial attacks,' say the researchers."

  • No surprise (Score:5, Interesting)

    by AmiMoJo ( 196126 ) on Thursday August 20, 2009 @05:33AM (#29130681) Homepage Journal

    This should not really come as a surprise to anyone. Like all evidence that has to be interpreted, the interpretation can be flawed.

    Shows like CSI have computers getting an exact match on fingerprints and DNA, but the real world is not like that. Fingerprint matching is entirely subjective and the print recovered from a crime scene is rarely a nice clean one like they show on TV. DNA often has to be manipulated before a match can be made (due to the sample found at the scene being too small or of poor quality) and even then it often matches more than one person.

    Even when you do get a match, it's not proof that someone was at a specific place because DNA and fingerprints can easily be transferred. Someone broke into my car a few years ago and despite there being fingerprints the police decided not to prosecute because they were on the outside of the car and the accused could just claim he leant on it on his way home from the pub.

    There have been a few cases where fingerprint and DNA evidence have been challenged in the UK courts and shown to be unreliable, with innocent people spending years in jail before being cleared. Yet, the police seem to have started asking for everyone in the area of a crime to "volunteer" their DNA. Presumably if you don't "volunteer" you become a suspect.

    The idea that handwriting is any more unique than those two and at all reliable is laughable.

  • by KibibyteBrain ( 1455987 ) on Thursday August 20, 2009 @05:43AM (#29130719)
    Again, that's why it's clear that writing analysis is only a positive test. If steps are taken to actively change the style of writing, of course it will fail. It is something like saying an audio recording of someone's voice in a phone call is invalid, because it is possible to speak in a different voice. While true, this doesn't significantly weaken the positive test value.
  • by hansraj ( 458504 ) on Thursday August 20, 2009 @05:48AM (#29130737)

    What exactly is the "Cormac McCarthy style"? The article doesn't mention it at all. I even skimmed through the paper and all it does is quote a paragraph from some work of Cormac McCarthy.

    I can't figure out what his style exactly is, and I certainly would not be able to fake it as the participants were supposed to. And the participants were supposed to not be literary geniuses.

  • by Lundse ( 1036754 ) on Thursday August 20, 2009 @05:50AM (#29130745)
    If you can describe something in enough detail to put it in a certain category (X writes like this), then you can also imitate that category from that same description (I will now write like this in order to seem like X).

    I do not really see how you would ever expect anything different.
  • by Moraelin ( 679338 ) on Thursday August 20, 2009 @06:11AM (#29130845) Journal

    Yes, but the problem is this:

    1. It's not just that it's possible to fake not being myself, it's also that I can pretty much frame someone else. E.g., given enough messages written by KibibyteBrain (clicking on the user name or id gives me a list of them), it's trivial to run a stylistic analysis on those, not just to get an idea of how to write in the same style, but to run the same analysis on my own result and refine it until the match is outstanding.

    2. From what I understand, the people in this test fooled it by merely being told to write in the style of someone else, without the help of any analysis tools, and still fooled it majorly. That's some pretty damn fragile "evidence" if anyone asks me. It's something Joe Sixpack can do by himself. Add some tools and it can only get crappier.

    Even such idioms as you mention are trivial to notice even without any tools. E.g., with only a little correspondence with another team here and reading some of their docs, I can tell that they use "solution" instead of "application".

    3. While it can be handwaved as "eh, nobody said it's perfect", some people do seem to take it as less fallible than it really is. Even you just called it "This is reasonable *evidence* of authorship, where of course evidence != proof." And that's the whole point. Something that can be fooled by almost any Joe Sixpack without any tools or much effort, isn't reasonable evidence at all.

    We allow evidence like handwriting, signatures, fingerprints, or DNA because they're supposedly very very hard to fake well. Ok, so DNA turned out to be fakable as well, but you need a fair bit of expensive lab equipment and knowledge. It's something a biology prof at a medical college could probably do, but not something Joey Three-fingers the small time smuggler would even know where to start with if he wants to plant someone else's fake blood at his latest shootout scene. Or fingerprints turned out to be easy to fake for the purpose of fooling a fingerprint reader, but it's still very very hard to transfer them to an object in a way that looks genuine.

    But here we have something that untrained people fooled by just being told to try. I'm sorry, but for me then it shouldn't be evidence at all.
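
    The "run the same analysis and refine" loop from point 1 could be sketched roughly as below. This is only an illustration, not the method from the paper or any court tool: the word list FUNCTION_WORDS and the helpers profile, cosine and similarity are made-up names, and real stylometric systems use far richer features.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a handful of common English function words.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "is", "was"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity(text_a, text_b):
    """Score how alike two texts are on the function-word profile alone."""
    return cosine(profile(text_a), profile(text_b))
```

    An imitator would compare a draft against the target's collected posts with similarity(draft, target_posts), edit the draft, and repeat until the score stops improving.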

  • Re:Did you RTFA? (Score:3, Interesting)

    by k.a.f. ( 168896 ) on Thursday August 20, 2009 @08:35AM (#29131549)

    No, but they knew they were being analyzed and for what. It's trivial to change my style (well, maybe not in English, I don't tend to have the word pool to draw from) and become someone else, if I know in advance that my writing will be used to find me.

    You can, probably, given time and persistence, sift through the thousands and millions of board messages posted everywhere on the internet and find out who I am in other boards. I didn't try to hide my identity against comparison of writing styles.

    I could see this working if applied to notes and texts written by someone who didn't have any reason to assume it would become the subject of an investigation. I'd deem it utterly worthless, though, when applied to ransom notes and the like.

    That's what I meant, sorry: even a computer program could outwit such analyses. Given the current state of automatic language analysis (Disclaimer: IAA computational linguist), I consider it obvious that a determined person can fool the discriminators enough to appear as someone else.

  • by Jason Levine ( 196982 ) on Thursday August 20, 2009 @08:42AM (#29131591) Homepage

    I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors. For example, when we purchased our house I had to sign my name to a dozen or more papers. The first signature looked "normal" but the later signatures were glorified scribbles. If I needed to sign a check last and just scribbled my signature on the back, would the bank (not privy to my signature's declining quality in the previous paperwork) be able to tell that it wasn't a bad fake?

  • by Anonymous Coward on Thursday August 20, 2009 @09:20AM (#29131935)

    Ummm Not Fair 20 years ago the exam board just labeled that 'Bad Grammmar' and failed me.

  • by neo ( 4625 ) on Thursday August 20, 2009 @09:53AM (#29132287)

    While you can attempt to write in someone else's style, you're going to run into problems duplicating it strongly enough for a stylometric analysis to implicate them. Even if you lifted exact phrases from previous works you will invariably need to come up with original words, phrases, and sentence structures to fill the gaps where the original author has not written. These should be enough to raise reasonable doubt as to the authorship of the faked text.

    Moreover, if it's identified as a fake, by eliminating the material that was copied from previous styles it's likely that your identity may be revealed from the pieces that you inserted to fill gaps. Obviously the longer the piece, the more likely this is.

    The technique of hiding one's own identity is a matter of using the same techniques in stylometrics to identify phrases, words, and structures that would identify you, and then changing these until they no longer give an indication of your identity.

    Attempting to create a work that duplicates someone else's stylometric signature would be fairly obvious to linguists.
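
    The identity-hiding step described above (find the words that give you away, then change them) might look something like this minimal sketch. The names word_rates and telltale_words are hypothetical, and comparing raw word rates against a single reference text is an assumption made for brevity, not an established stylometric procedure.

```python
import re
from collections import Counter

def word_rates(text):
    """Map each word to its share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty input
    return {w: c / total for w, c in Counter(words).items()}

def telltale_words(own_text, reference_text, top_n=5):
    """Words the author uses at a higher rate than the reference, most overused first."""
    own = word_rates(own_text)
    ref = word_rates(reference_text)
    excess = {w: rate - ref.get(w, 0.0) for w, rate in own.items()}
    # Sort by largest excess; break ties alphabetically so the order is stable.
    ranked = sorted(excess.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, diff in ranked[:top_n] if diff > 0]
```

    A writer would swap out whatever telltale_words returns, rerun it on the revised text, and stop once nothing stands out.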

  • by Anonymous Coward on Thursday August 20, 2009 @10:45PM (#29142465)

    Over in Japan, we use Hanko, which are simply ink stamps.
    While signatures can be forged, Hanko is susceptible to theft AND duplication from the stamp.
    I think signatures work on the assumption that signatures are like "artifacts" of one's personality - pretty much like statistics that describe the character of a population. The same goes for stylometrics.
    These, like MD5, are good for match identification, but not for authentication.
    Using stylometrics as evidence IMHO is a misuse of technology.
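
    To make the MD5 comparison concrete, here is a small sketch using Python's standard hashlib and hmac modules. The function names fingerprint, tag and verify are made up for the example, and HMAC-SHA256 is chosen only as one common way to get authentication, not something the comment above prescribes.

```python
import hashlib
import hmac

def fingerprint(message: bytes) -> str:
    """Unkeyed digest: anyone holding the message can recompute it."""
    return hashlib.md5(message).hexdigest()

def tag(message: bytes, key: bytes) -> str:
    """Keyed digest: only someone who knows the key can produce it."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(message: bytes, key: bytes, claimed_tag: str) -> bool:
    """Check a tag in constant time to avoid leaking how much of it matched."""
    return hmac.compare_digest(tag(message, key), claimed_tag)
```

    Two equal fingerprints say the contents match, but nothing about who produced them. A writing style is closer to the unkeyed case: it is derived from public text, so anyone can study it and reproduce it.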
