
FBI Fights Testing For False DNA Matches

Statesman writes "The Los Angeles Times reports that an Arizona crime lab technician found two felons with remarkably similar genetic profiles, so similar that they would ordinarily be accepted in court as a match, but one felon was black and the other white. The FBI estimated the odds of unrelated people sharing those genetic markers to be as remote as 1 in 113 billion. Dozens of similar matches have been found, and these findings raise questions about the accuracy of the FBI's DNA statistics. Scientists and legal experts want to test the accuracy of official statistics using the nearly 6 million profiles in CODIS, the national system that includes most state and local databases. The FBI has tried to block distribution of the Arizona results and is blocking people from performing similar searches using CODIS. A legal fight is brewing over whether the nation's genetic databases ought to be opened to wider scrutiny. At stake is the credibility of the odds often cited in DNA cases, which can suggest an all but certain link between a suspect and a crime scene."

  • by teh moges ( 875080 ) on Saturday July 19, 2008 @11:29PM (#24259011) Homepage
    I believe the problem with this is outlined in this article: http://en.wikipedia.org/wiki/Prosecutor's_fallacy [wikipedia.org]

    Have a read. It shows that you can't trust statistics when you only have half of the picture, and why it can be so dangerous to do so.
  • by Anonymous Coward on Saturday July 19, 2008 @11:53PM (#24259163)

    It's not wrong (at least not by much).

    Well, a quick calculation shows that 122 matches out of a database of 65,000 (2 billion pairwise comparisons) and 908 matches out of a database of 220,000 (24 billion comparisons) both work out, when looking at all pairs, to a match rate of about 1/20,000,000 per pair.

    Of course, this does not take into account that he's looking for any 9 matches out of 13. As the article mentioned, if any of the loci do not match, an overall mismatch is called. So the probability goes down further, because for a given profile one specific set of 9 of the 13 loci has to be the set that matches (10+ matches out of 13 are rare), and there are 715 ways to pick 9 out of 13, which is a factor of 1/715. That gets us in the ballpark of 1/100 billion of having that kind of a match.

    Maybe the numbers are slightly off, but they don't seem wrong by much.

    They should've gotten a statistician to explain the results to any judge and jury who needed to hear it rather than fighting it tooth and nail like they did. Looks like the FBI doesn't understand it properly either, which is definitely worrisome.

    There's nothing to see here, just people who don't realize how statistics apply when dealing with a large number of comparisons.
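
    The arithmetic in this comment can be checked with a short sketch. It assumes "pairwise comparisons" means n choose 2 and reuses the database sizes and match counts quoted above; the function names are made up for illustration.

```python
from math import comb


def pair_count(n: int) -> int:
    """Number of distinct pairs among n profiles (n choose 2)."""
    return comb(n, 2)


def per_pair_rate(matches: int, n: int) -> float:
    """Observed matches divided by the number of pairwise comparisons."""
    return matches / pair_count(n)


# Figures quoted in the comment: (database size, observed 9-locus matches).
for n, matches in [(65_000, 122), (220_000, 908)]:
    rate = per_pair_rate(matches, n)
    print(f"n={n:,}: {pair_count(n):,} pairs, about 1 in {1 / rate:,.0f} per pair")

# Number of ways to choose which 9 of the 13 loci agree.
print(comb(13, 9))
```

    With these inputs the per-pair rates land between roughly 1 in 17 million and 1 in 27 million, and comb(13, 9) is 715. Dividing by 715 gives something on the order of 1 in 12 to 19 billion, which is within an order of magnitude of the quoted 1 in 113 billion.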

  • Birthday paradox (Score:5, Informative)

    by russotto ( 537200 ) on Sunday July 20, 2008 @12:03AM (#24259245) Journal
    The FBI says that the chance of any given person matching another unrelated person is 1 in 113 billion. They claim that the reason the Arizona lab tech found as many matches as she did ("dozens") is because she was checking the whole database (6 million entries) against itself. This is a straightforward birthday paradox issue, then. According to the Wikipedia birthday problem page, the number of collisions expected given d = 113 billion different "birthdays" and n = 6 million "people in the room" is n - d + d((d-1)/d)^n. This is about 160 matches! So in fact the FBI may be right. Note that the chance of a given person matching _anyone_ in the database is about 0.0053%, which is much greater than 1 in 113 billion.
  • Re:Birthday paradox (Score:3, Informative)

    by russotto ( 537200 ) on Sunday July 20, 2008 @12:05AM (#24259259) Journal

    (yeah, I suck, forgot "plain old text")

    The FBI says that the chance of any given person matching another unrelated person is 1 in 113 billion. They claim that the reason the Arizona lab tech found as many matches as she did ("dozens") is because she was checking the whole database (6 million entries) against itself. This is a straightforward birthday paradox issue, then.

    According to the Wikipedia birthday problem page, the number of collisions expected given d = 113 billion different "birthdays" and n = 6 million "people in the room" is n - d + d((d-1)/d)^n. This is about 160 matches! So in fact the FBI may be right.

    Note that the chance of a given person matching _anyone_ in the database is about 0.0053%, which is much greater than 1 in 113 billion.
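
    A minimal sketch of that calculation, assuming the expected-collision formula is n - d + d((d-1)/d)^n with d = 113 billion and n = 6 million. The function names are illustrative, and the expression is rearranged with expm1/log1p only because subtracting two numbers near 113 billion directly would lose precision in floating point.

```python
from math import expm1, log1p


def expected_collisions(n: float, d: float) -> float:
    """Expected collisions n - d + d*((d-1)/d)**n, computed stably.

    Rewritten as n + d*(((d-1)/d)**n - 1), where the bracket is
    expm1(n * log1p(-1/d)).
    """
    return n + d * expm1(n * log1p(-1.0 / d))


def prob_matches_anyone(n: float, d: float) -> float:
    """Chance that one given profile matches at least one of n others."""
    return -expm1(n * log1p(-1.0 / d))


d = 113e9  # possible "birthdays" (the quoted 1 in 113 billion)
n = 6e6    # profiles in the database

print(f"expected collisions: {expected_collisions(n, d):.1f}")
print(f"chance of matching anyone: {prob_matches_anyone(n, d):.4%}")
```

    With these inputs the expected number of collisions comes out just over 159, and the chance of one given person matching anyone is about 0.0053%, in line with the figures in the comment.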

  • by Anonymous Coward on Sunday July 20, 2008 @12:12AM (#24259287)

    Your math is basically sound, however they are only using THIRTEEN "markers" to make their identification/match.

    If they used the entire thing, I suspect your math would be completely correct.
    Would you care to re-do your math using only 13 points as the profile?

  • Birthday Paradox (Score:4, Informative)

    by jberryman ( 1175517 ) on Sunday July 20, 2008 @12:22AM (#24259349)

    If I'm not mistaken, what you've described is the Birthday Paradox:

    http://en.wikipedia.org/wiki/Birthday_paradox/ [wikipedia.org]

  • by wronskyMan ( 676763 ) on Sunday July 20, 2008 @12:22AM (#24259351)
    DNA and fingerprints are useful in conjunction with other evidence, just as any other type of forensic or circumstantial evidence is. If someone passed out and died during a movie from a blow dart, for example, it would not be prudent to arrest a random person if they had tickets to the movie; similarly, if there is a particular DNA profile on the dart that 10,000 people match, those 10K should not be brought in. However, if someone had tickets to the movie && matched the DNA it would probably be a good idea to bring them in. CSI type shows are partly to blame - the average citizen on the jury trusts scientific evidence on its own far too much instead of the old detective story trio of means, motive and opportunity (forensics cannot help at all with the second).
  • by JoeMerchant ( 803320 ) on Sunday July 20, 2008 @12:24AM (#24259371)

    That math - simple as it is - is too complex to explain to the average viewer in a 30 second news byte... what the media will do is take those 159 matches and blow them into a sensational story about the possibility (not probability) that DNA nabbed the wrong guy. If they can sufficiently suppress this story, they will have a lot fewer jurors quoting the news byte as absolute proof that DNA evidence can't be trusted.

    Still, they should do the test - I'm not worried if there are 50, 159, or 300 "matches" - I'd like to know if there are 1500+

  • Re:well, well... (Score:3, Informative)

    by iminplaya ( 723125 ) on Sunday July 20, 2008 @12:33AM (#24259437) Journal

    Hmmmm [usdoj.gov]...

    At midyear 2007 there were 4,618 black male sentenced prisoners per 100,000 black males in the United States, compared to 1,747 Hispanic male sentenced prisoners per 100,000 Hispanic males and 773 white male sentenced prisoners per 100,000 white males.

    Almost 6 to 1. And how many people will use these numbers to justify their racist attitudes instead of realizing who's being targeted? An economic breakdown might be even more revealing.

  • Re:well, well... (Score:5, Informative)

    by rohan972 ( 880586 ) on Sunday July 20, 2008 @01:13AM (#24259619)
    Possibly significant in terms of sample size.
    If person A has a DNA profile that matches one other person in the country, it is still very strong evidence.
    If upon checking the other states there was found to be an average of one matching person per state, 50 matches, still strong evidence, but not nearly so conclusive. Would now require stronger supporting evidence to be "beyond reasonable doubt".
    If (prison population being approx 1%) there are found to be 100 matches per state, 5000 matches, then DNA becomes more useful as evidence for acquittal than for conviction, ie: non-matching still proves it wasn't you but matching doesn't prove it was you.
  • by Jah-Wren Ryel ( 80510 ) on Sunday July 20, 2008 @01:26AM (#24259665)

    Everything, I suppose, but your nic.

    Since presumably he is not a member of the government, "radical transparency" does not apply to his identity.

  • by huit ( 1285438 ) on Sunday July 20, 2008 @01:41AM (#24259721)
    Glad to see mention of the birthday paradox, it illustrates the issue nicely. I worked on a genetic mark recapture program that encountered just this effect. Initially things looked great but as the sample size increased we started encountering "shadows" (individuals that share markers at all loci sampled but aren't true matches) with greater frequency. To study large populations you need markers with significantly lower probability of identity than has been assumed in a lot of research. We often remarked how ridiculous the statistics quoted by journalists and in court are.
  • Re:well, well... (Score:4, Informative)

    by jonadab ( 583620 ) on Sunday July 20, 2008 @01:44AM (#24259733) Homepage Journal
    Try a geographic breakdown. Here's a hint: it correlates *strongly* with population density. A very disproportionately high percentage of the crime occurs in the urban areas. Something like 90% of the crime, and 99% of violent crime, in the big urban areas that house about 40% of the population.
  • Re:well, well... (Score:2, Informative)

    by Anonymous Coward on Sunday July 20, 2008 @01:48AM (#24259759)

    I can see it now. Sparticus II: "We Are Sarcasticus!"

    See, that was a horrible joke.

  • Re:well, well... (Score:3, Informative)

    by KGIII ( 973947 ) <uninvolved@outlook.com> on Sunday July 20, 2008 @03:07AM (#24260029) Journal
    Find the owners of the CCA and look a bit more heavily into the privatized jails and prisons that are springing up around the country.
  • Re:I wonder... (Score:4, Informative)

    by LaskoVortex ( 1153471 ) on Sunday July 20, 2008 @03:26AM (#24260135)

    . That is to say, the chance of having marker A might be 1% and the chance of having marker B might be 5%, but the chance of having BOTH might very well be higher (or lower) than .05%.

    IANAFG (I am not a forensic geneticist) but the co-segregation of genetic markers is such a fundamental and well understood process that I would have a hard time believing that they wouldn't know and correct for the rates of their chosen set when calculating the probabilities of a matched set.

    Of course the statistics they calculate are probably based on estimates of pairwise segregation. Some higher-order effects may be at work that change the statistics relative to a basic model like independent pairwise segregation.

    For example, allele A of gene 1 and allele B of gene 2 may not segregate according to a previously measured pairwise statistic in the presence of allele C of gene 3. Such higher-order effects may have a significant impact on the statistics but would require a *lot* of data to reveal.
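
    To make the quoted 1% and 5% example concrete, here is a small sketch comparing the product rule under independence with the joint probability when the markers are linked. The conditional probability of 0.20 is a made-up number for illustration only, not a measured value.

```python
def joint_if_independent(p_a: float, p_b: float) -> float:
    """P(A and B) when the two markers are statistically independent."""
    return p_a * p_b


def joint_with_dependence(p_a: float, p_b_given_a: float) -> float:
    """P(A and B) = P(A) * P(B | A), valid whether or not they are independent."""
    return p_a * p_b_given_a


p_a, p_b = 0.01, 0.05  # marker frequencies from the quoted example

independent = joint_if_independent(p_a, p_b)
# Hypothetical linkage: carriers of A have B 20% of the time instead of 5%.
linked = joint_with_dependence(p_a, 0.20)

print(f"independent: {independent:.4%}, linked: {linked:.4%}")
```

    Under independence the joint frequency is 0.05%; with the hypothetical linkage it is four times higher, which is the kind of gap the parent comment is worried about.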

  • by ShakaUVM ( 157947 ) on Sunday July 20, 2008 @04:00AM (#24260231) Homepage Journal

    >>There is a big difference between telling a lay jury "this match had a one in a 113 billion chance of occurring at random" versus "this is an event that occurs randomly on a routine basis." Non-statisticians have a hard time getting their head around the concept of correction for multiple hypothesis testing.

    To give an apocryphal quote by Mark Twain: "People use statistics the same way drunks use lampposts - for support, not illumination."

    The lack of ability to reason statistically is extremely common in America. I mean extremely common - even in grad students publishing papers on stats, or in the technologically literate crowd. I used to write up examples of egregiously bad stats from papers and news reports in my livejournal, but gave up because it was so common.

    The DNA testing example is actually an example we studied in the Bayesian/conditional chapter of my stats textbook. It described an actual court case in LA where a guy was convicted solely on DNA evidence (there was no other evidence to convict him, and he wasn't lucky enough to have an alibi) because the prosecutor confused the odds of a random match (in this case only one-in-a-million, and those are some pretty powerful odds) with the odds that the defendant was innocent. Of course, one-in-a-million would mean that in LA alone, there would be 6 people (on average) matching the DNA, and so the chance of the guy being guilty is actually only 1/6 or so.

    The problem I have with the DNA "this has a one in 113 billion chance of matching" is that this is an extrapolated number based on certain premises of independence between the different loci. Whereas the more we learn about DNA, the more we learn that there is a high degree of covariability, certainly enough that (as the article shows), the odds of a match are actually much much higher.
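
    The one-in-a-million case above can be sketched with Bayes' rule. The assumptions are loud ones: a pool of 6,000,000 possible suspects (backed out from the "6 people" figure in the comment, not a sourced population count), every person equally likely to be the culprit before the DNA result, and the true culprit always matching. The function names are hypothetical.

```python
def expected_random_matches(population: int, random_match_prob: float) -> float:
    """Average number of people in the pool whose profile matches by chance."""
    return population * random_match_prob


def prob_guilty_given_match(population: int, random_match_prob: float) -> float:
    """Posterior that a matching person is the culprit, DNA being the only evidence.

    Prior 1/N for each person, P(match | guilty) = 1, P(match | innocent) = p,
    so Bayes' rule gives 1 / (1 + (N - 1) * p).
    """
    return 1.0 / (1.0 + (population - 1) * random_match_prob)


pool = 6_000_000  # assumed pool size, chosen so that one-in-a-million gives ~6 matches
p = 1e-6          # the quoted random-match probability

print(f"expected chance matches: {expected_random_matches(pool, p):.1f}")
print(f"P(guilty | match): {prob_guilty_given_match(pool, p):.3f}")
```

    Under these assumptions the posterior is about 1/7, the same ballpark as the "1/6 or so" in the comment and nowhere near the 999,999-in-a-million that the raw match odds suggest.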

  • Re:Birthday Paradox (Score:3, Informative)

    by ya really ( 1257084 ) on Sunday July 20, 2008 @04:58AM (#24260461)
    Good ole pigeonhole theorem, haven't seen that since discrete mathematics.
  • by thogard ( 43403 ) on Sunday July 20, 2008 @05:17AM (#24260553) Homepage

    I don't think they are even using markers. I thought they were using a process that basically duplicates the DNA a massive number of times, then use gravity vs. capillary action to weigh the different chromosomes which may or may not have been through a blender 1st. They are not comparing gigabits of data to verify a DNA match.

  • by MPAB ( 1074440 ) on Sunday July 20, 2008 @05:44AM (#24260643)

    Or that's what they expect you to conclude.

    These tests are chosen so they can tell a person apart from his family, even his own twin in some cases! The "extremely close matching DNA" you mention consists of a very small portion of the subject's DNA which in most cases encodes nothing we know of.

    These tests can conclude "subject X is not the same as suspect A", but otherwise they can only say "there's a very high probability of suspect A being the same as subject X".

    There's racial tracers in DNA that can tell how long ago your lineage forked from its branch, like in the National Geographic Global Gene Project. And that proves differences among human races.

  • Re:Birthday Paradox (Score:5, Informative)

    by John_Sauter ( 595980 ) <John_Sauter@systemeyescomputerstore.com> on Sunday July 20, 2008 @08:58AM (#24261433) Homepage

    If I'm not mistaken, what you've described is the Birthday Paradox:

    http://en.wikipedia.org/wiki/Birthday_paradox/ [wikipedia.org]

    You aren't mistaken, but the Wikipedia reference is actually Birthday problem [wikipedia.org].

  • by karlandtanya ( 601084 ) on Sunday July 20, 2008 @10:32AM (#24262087)
    Remember the "birthday problem"? "How likely do you think it is that any two people in this classroom have the same birthday?" Most of the kids take a quick look around, see ~30 people in the room, know there's 365 days in a year and think--not very likely. But there's usually a match.

    In a classroom full of kids, the probability that any two particular children have the same birthday is (we'll ignore leap year for simplicity) 1/365. We need to know the probability that none of the kids have the same birthday, so we go kid by kid and look at the chance of each new kid colliding with the ones already counted--this is just a more useful wording of the criterion.

    So, first 2 kids, probability of a collision: 1/365
    Third kid--if his b-day lands on either of the first two kids' you get a hit: 2/365
    Fourth--3/365 chance of a collision.
    ...
    And, of course, if you had 366 kids in the room, the last one's a sure thing.

    You multiply the probabilities of a series of independent events to get the probability of the whole series. If we have 30 kids and 365 days, we want to know the chances of 29 misses (no collisions) in a row, one for each kid after the first. If P is the probability of something happening, then the probability of NOT (something happening) is 1-P.

    So, probability of 29 misses in a row will be (1-1/365) * (1-2/365) * (1-3/365) * ... * (1-29/365), which is ~.2937. So, 1-.2937 tells you that if you've got 30 kids in the room you've got about a 7-in-10 shot at two of them having the same birthday.

    Quickly iterating through the same process in oo.o calc for an FBI database with...
    ...ability to recognize 113E+09 unique DNA profiles
    ...DNA from a million folks (no idea how many of us they really have)
    gives you .988 probability of collision.

    BTW, the general formula for the "birthday problem" is written as follows:
    P = d!/[(d-n)!(d^n)]
    Where
    P = probability of no collisions
    d = number of days in the year
    n = number of students in the sample
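
    The same iteration can be sketched in a few lines instead of a spreadsheet. This assumes the no-collision probability is the product of (1 - k/d) for k from 1 to n-1, and sums logarithms rather than multiplying so that a million tiny factors do not lose precision. The function name is illustrative.

```python
from math import exp, fsum, log1p


def prob_no_collision(n: int, d: float) -> float:
    """P(no two of n samples share a value among d equally likely values).

    Equals d! / ((d - n)! * d**n), computed as the product of (1 - k/d)
    for k = 1 .. n-1 via a sum of logs.
    """
    if n > d:
        return 0.0  # pigeonhole: more samples than values forces a collision
    return exp(fsum(log1p(-k / d) for k in range(1, n)))


# Classroom: 30 kids, 365 days.
classroom = 1 - prob_no_collision(30, 365)

# Database: a million profiles, 113E+09 distinguishable profiles.
database = 1 - prob_no_collision(1_000_000, 113e9)

print(f"classroom collision probability: {classroom:.4f}")
print(f"database collision probability:  {database:.4f}")
```

    For 30 kids and 365 days the collision probability comes out at roughly 0.71, and for a million profiles against 113E+09 possibilities it is about 0.988.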
  • Forensic "science" (Score:4, Informative)

    by exp(pi*sqrt(163)) ( 613870 ) on Sunday July 20, 2008 @10:38AM (#24262135) Journal
    > The FBI estimated the odds of unrelated people sharing those genetic markers to be as remote as 1 in 113 billion

    As I've said time and time again. Forensic science is a scam. Second rate statisticians and second rate politicians team up with second rate scientists and second rate TV shows to convince the public that forensic superheroes can detect evidence of any evil crime you commit. It's just a way to keep the people under control.

  • by Alsee ( 515537 ) on Sunday July 20, 2008 @12:07PM (#24262919) Homepage

    The test would immediately give a 100% result of it as non-human DNA.
    One of the reasons for that is the fact that humans have 23 pairs of chromosomes whereas chimps and other primates have 24 pairs. We didn't "lose" a chromosome - one strand of DNA got glued on to the end of one of the other strands of DNA, so all of the same genetic information is still there.

    -
