How Statistics Can Foul the Meaning of DNA Evidence
azoblue writes with a piece in New Scientist that might make you rethink the concept of "statistical certainty." As the article puts it, "even when analysts agree that someone could be a match for a piece of DNA evidence, the statistical weight assigned to that match can vary enormously, even by orders of magnitude." Azoblue writes: "For instance, in one man's trial the DNA evidence statistic ranged from 1/95,000 to 1/13, depending on the different weighing methods used by the defense and the prosecution."
Damn Lies and Statistics! (Score:2, Interesting)
Re:Damn Lies and Statistics! (Score:3, Interesting)
DNA evidence is the new fingerprint. News at 11.
Juries (Score:2, Interesting)
People don't understand statistics (Score:4, Interesting)
Let's say you have evidence that matches 1 in a thousand people. You search through your database of all 1000 suspects and you get a single match. Did he do it? Logically you'd expect this to mean you can be 99.9% sure. You then search through a database of a million random people. You get 1000 matches. Does this mean there's only a 0.1% chance that your original suspect was guilty? Well, maybe there's some other compelling evidence that makes it most likely that one of those 1000 people was the culprit. But you have 10000 outliers. They're each a tenth as likely to have committed the crime. You get 10 matches. So, once again we're at the 50% probability of guilt, or something in that ballpark.
I'm sure this is a somewhat different example than that given in the article, but that's not the point. The point is: is there a 99.9% probability, a 0.1% probability, a 50% probability, or some other probability of guilt? Or am I just trying to confuse you by throwing numbers at you?
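For anyone who wants to poke at the numbers, here is a minimal sketch of one way to get from a match rate to a probability of guilt. It is not from the article. It assumes every person in the pool is equally likely to be the source before the test, that the true source always matches, and that there is no lab error. The function name `posterior_guilt` and its parameters are made up for illustration.

```python
from fractions import Fraction

def posterior_guilt(pool_size, match_prob):
    """P(suspect is the source | suspect matches), by Bayes' rule.

    Assumptions (not from the article): a uniform prior over pool_size
    people, the true source always matches, no lab error.
    """
    prior = Fraction(1, pool_size)
    # The source matches with probability 1; anyone else with match_prob.
    evidence = prior + (1 - prior) * match_prob
    return prior / evidence

one_in_a_thousand = Fraction(1, 1000)

# Pool of 1000 possible culprits: roughly a coin flip, not 99.9%.
print(float(posterior_guilt(1000, one_in_a_thousand)))

# Pool of a million possible culprits: roughly 0.1%.
print(float(posterior_guilt(1_000_000, one_in_a_thousand)))
```

Under these assumptions the same "1 in a thousand" match gives about 50% for a pool of 1000 and about 0.1% for a pool of a million, so the answer depends mostly on who you were willing to consider before the test.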
Several orders of magnitude? Not quite (Score:2, Interesting)
When expert A says he's certain of a match to "1 in a billion" he's really saying he's certain to 0.999999999. When expert B says he's certain to 1 in a million, that's certain to 0.999999.
Compare this to the "not so far apart" difference between expert A saying "he's 1 in 10" and expert B saying "he's 1 in 5." The difference between 0.9 and 0.8 certainty is a lot greater than the difference in certainty in the first example.
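The arithmetic behind that comparison can be checked with a short sketch. It treats each expert's "1 in N" as the chance of a coincidental match, as the comment does, and reports two numbers: the gap between the two certainties and the ratio between the two doubts. The function `compare` is a made-up helper, and which of the two numbers should matter to a jury is exactly what is being argued here.

```python
from fractions import Fraction

def compare(p_a, p_b):
    """p_a, p_b: each expert's stated chance of a coincidental match.

    Returns (absolute gap in certainty, ratio of the larger doubt
    to the smaller doubt).
    """
    certainty_gap = abs((1 - p_a) - (1 - p_b))
    doubt_ratio = max(p_a, p_b) / min(p_a, p_b)
    return certainty_gap, doubt_ratio

# 1 in a billion vs 1 in a million
gap, ratio = compare(Fraction(1, 10**9), Fraction(1, 10**6))
print(float(gap), float(ratio))

# 1 in 10 vs 1 in 5
gap, ratio = compare(Fraction(1, 10), Fraction(1, 5))
print(float(gap), float(ratio))
```

The first pair differs by less than a millionth in certainty but by a factor of 1000 in doubt; the second pair differs by 0.1 in certainty but only by a factor of 2 in doubt.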
By the way, if I'm on a jury, I'm interested in "who else could've done it" not raw numbers. If two people leave the crime scene and blood is a "certain to 1 in 5" match to the defendant, that is, there's a 20% chance of a mistake, and the only other person who was at the crime scene has been ruled out, the only way I'll acquit is if the defendant either makes a very very strong case he didn't do it or provides some explanation for the evidence that doesn't require either of the initial suspects to be guilty.
In more practical terms, if you can raise the odds of certainty high enough that it's implausible that two people within 100 miles of the crime scene at the time of the crime are a match, and you make a very strong claim that the DNA sample is a result of the criminal being there at the time of the crime, the defense is going to have to work very hard to get me to acquit.
Re:It's fine for saying "it's somebody else". (Score:3, Interesting)
IMHO DNA evidence is decisive for the defense
DA's these days refuse to accept that. For instance, down here in Texas we had a guy convicted of raping a woman. The woman claimed two guys raped her. Two sets of male DNA were recovered. The technician lied^Wmistakenly testified on the stand that the guy matched one set. One MASSIVE scandal later, his DNA was retested and didn't match either set of DNA.
That should be it, right? Well, the DA spent quite a lot of time fighting the release, insisting that this guy was one of the rapists and wore a condom and the woman couldn't count to three. Of course, I'm sure the fact that we end up paying people who get imprisoned because the government fucks up had no bearing at all on the government's desire to convince everyone they didn't fuck up.
And if the people are relatives? (Score:5, Interesting)
In this case, however, there were many people present at the discovery of the object from which the DNA was taken for analysis. As it happens, several of these people were relatives (brother, mother) of the person the prosecution were trying to persuade us was the person that possessed (in legal terms) the object.
The question that I kept hoping the defense attorney would ask was "what are the probabilities of an erroneous match if the people are relatives, not just two random people off the street"? Unfortunately, he didn't.
As it happened, there were so many other peculiarities in this case as well as some pretty bizarre testimony from prosecution witnesses that we voted to acquit without making much of the DNA evidence.
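The relatives question can be put into rough numbers. The sketch below uses the standard population-genetics result that full siblings share both alleles by descent at a locus a quarter of the time, one allele half the time, and none a quarter of the time. Everything else is an assumption for illustration, not data from this trial: 10 equally common alleles per locus, 13 independent loci, random mating, and the function names.

```python
from fractions import Fraction

def random_match(freqs):
    """Chance two unrelated people share a genotype at one locus."""
    s2 = sum(p**2 for p in freqs)
    s4 = sum(p**4 for p in freqs)
    return 2 * s2**2 - s4

def sibling_match(freqs):
    """Chance two full siblings share a genotype at one locus.

    1/4 of the time they share both alleles (certain match), 1/2 of
    the time one allele (match if the other allele agrees, chance s2),
    1/4 of the time none (same as two unrelated people).
    """
    s2 = sum(p**2 for p in freqs)
    s4 = sum(p**4 for p in freqs)
    return (1 + 2 * s2 + 2 * s2**2 - s4) / 4

# Hypothetical locus: 10 alleles, each with frequency 1/10.
freqs = [Fraction(1, 10)] * 10
loci = 13  # assumed number of independent loci in the profile

print(float(random_match(freqs)), float(sibling_match(freqs)))
print(float(random_match(freqs) ** loci))   # unrelated people, full profile
print(float(sibling_match(freqs) ** loci))  # full siblings, full profile
```

With these made-up frequencies the per-locus match chance is 0.019 for strangers and about 0.305 for siblings. A sibling can never do better than 1 in 4 per locus, so the full-profile number for a brother is many orders of magnitude weaker than the "random person off the street" figure.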
Re:This happened to me ... (Score:5, Interesting)
...we couldn't rule out the possibility that the cops got the wrong guy, so we found him not guilty. If I had to take a bet, I'd say he did it, but I wouldn't bet his life on it.
I'm glad you think the way that you do. Too many people would call him guilty if they figured there was a 55% chance of it being him. Hmm, someday someone will have to draw a line in the sand as to what odds constitute reasonable doubt. If a trial could be conducted in a completely unbiased, Bayesian way and a probability of guilt were established, what number would be needed to convict?
Re:It's fine for saying "it's somebody else". (Score:2, Interesting)
Simply, DNA evidence is by nature exclusionary. The scientifically correct result of a DNA test is excluded or not-excluded.
Exactly. As I understand the actual DNA testing process (please, Geneticists and Biologists, correct me if I'm wrong):
1) Take or obtain the sample.
2) Extract the DNA by getting rid of all the pesky pieces of cells that aren't DNA, hope there isn't too much other crap with DNA inside the sample (bacteria, virus, dog poo, etc.).
3) Put in some Xerox machines and raw materials. Make lots and lots of copies. Work off of the assumption that the copy process is perfect in every case. Make more copies of the copies.
4) After you have the proper volume of real and copied DNA, divide your sample into the number of different "LOCI" you want to test.
LOCI is a scientific word for specific sequences of "non-coding" DNA that appear between the genes we have identified.
"Non-coding" as used above doesn't necessarily mean non-coding; it may mean it does something we haven't figured out yet. Nobody is really positive.
5) Insert "scissors" that identify and cut strands of DNA anytime the particular sequence is found (ATGATGATGATG=snip snip, for example).
6) Now that the particular sample has been thoroughly chopped into little pieces that have no resemblance to the original chromosomes in the sample, divide the hunks of DNA by size and represent the occurrences of varying lengths of the chunks of DNA (this used to be done by slurping up the sample with absorbent paper if memory serves - I think there are machines that can do this as well).
7) Now that we have our visual representation of the sample, chopped apart using different scissor patterns, without ever having actually looked at genes, chromosomes, and non-coding sections of chromosome strands between the genes we have identified, compare to a sample obtained from our suspect and convict or maybe acquit.
NOTE: At no time did we actually compare the DNA strands, genes, or chromosomes between the suspect and the sample. Just chopped them up and looked for patterns in the lengths of what remained after chopping.
That about right for all you learned guys?
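Steps 5 through 7 as described above can be mimicked with a toy model: cut a string wherever a recognition sequence appears, keep only the fragment lengths, and compare the length patterns. This is a sketch of the description in this comment, not of any real lab protocol. The sequences, the cut site, and the function names are invented for illustration.

```python
def fragment_lengths(dna, site):
    """Cut dna right after every occurrence of site; return sorted lengths."""
    lengths = []
    start = 0
    pos = dna.find(site, start)
    while pos != -1:
        cut = pos + len(site)
        lengths.append(cut - start)
        start = cut
        pos = dna.find(site, start)
    if start < len(dna):
        lengths.append(len(dna) - start)  # leftover piece after the last cut
    return sorted(lengths)

def same_pattern(sample, suspect, site):
    """Step 7: compare only the length patterns, never the sequences."""
    return fragment_lengths(sample, site) == fragment_lengths(suspect, site)

site = "GAATTC"                      # hypothetical "scissors" sequence
sample = "AAGAATTCCCGAATTCT"         # made-up evidence sequence
suspect = "TTGAATTCGGGAATTCA"        # made-up, different sequence

print(fragment_lengths(sample, site))    # [1, 8, 8]
print(same_pattern(sample, suspect, site))  # True
print(sample == suspect)                 # False
```

The two made-up sequences differ, yet they produce the same pattern of fragment lengths, which is the point of the NOTE: the comparison is between length patterns, so a match means "not excluded" rather than "identical DNA".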
Re:Whaa? (Score:3, Interesting)
Re:This happened to me ... (Score:3, Interesting)
I'm glad you think the way that you do.
What I was surprised at was the unanimity of the jury in the case: *everyone* thought that while the defendant was probably a gangster, the prosecution didn't meet the burden of proof. The jury had everything from suburban housewives to college professors (me) to retired black civil servants to a young hispanic man with gold chains and obvious 'hood experience, and everybody came to the same conclusion with the same rationale. Deliberation took about half an hour, mostly because we wanted to finish our pizza before rendering a verdict.
----------------
PS: Derailing my own thread here, but did you just use "unbiased" and "Bayesian" in the same phrase? The whole *point* of Bayesian analysis is that the data is biased by your prior assumptions. In a good way, but still.