How reliable is DNA in identifying suspects?
July 20, 2008 12:18 PM

A discovery leads to questions about whether the odds of people sharing genetic profiles are sometimes higher than portrayed. Calling the finding meaningless, the FBI has sought to block such inquiry.
posted by finite (30 comments total) 7 users marked this as a favorite

 
HowStuffWorks.com: How DNA Evidence Works
posted by finite at 12:19 PM on July 20, 2008


In other news: Nobody understands hash collisions, apparently.
posted by odinsdream at 12:31 PM on July 20, 2008 [6 favorites]


DOJ DNA Policybook: Use of DNA to solve Crimes.
posted by acro at 12:35 PM on July 20, 2008


An interesting and well researched story. I wish there were more talk about why the FBI is so dead-set against this, though. I mean, besides being control freaks.
posted by hattifattener at 12:41 PM on July 20, 2008


Unfortunately courtroom 'science' != actual science. Where this is the worst is probably courtroom psychology where bullshit like drawing analysis and whatnot can be used as evidence.
posted by delmoi at 12:44 PM on July 20, 2008 [1 favorite]


Birthday paradox
posted by yoyo_nyc at 12:46 PM on July 20, 2008 [3 favorites]


The FBI laboratory, which administers the national DNA database system, tried to stop distribution of Troyer's results and began an aggressive behind-the-scenes campaign to block similar searches elsewhere, even those ordered by courts, a Times investigation found.

This is possibly the most pernicious legacy of the Bush administration: government agencies trying to block hard science research when the results would cause problems for them. If this attitude survives Bush then it's hard to see how it won't eventually result in the destruction of research in the U.S.
posted by JHarris at 1:02 PM on July 20, 2008 [5 favorites]


This is possibly the most pernicious legacy of the Bush administration: government agencies trying to block hard science research when the results would cause problems for them.

You think that's unique to this administration? (You must be young.)
posted by Class Goat at 1:07 PM on July 20, 2008


Class Goat, of course it's not new, and it's probably to some degree part of every relationship between those with power and those who hold truths inconvenient for power.

But it is definitely arguable that the Bush administration is an outlier from the pack in their unrelenting habits here.

Whether or not that means the Bush administration and resultant executive culture is behind the FBI's playing weird about this, I can't say, but it's not a particularly big stretch considering some of the observed behavior in other parts of the executive machinery that can be traced to Bush and co.
posted by wildblueyonder at 1:16 PM on July 20, 2008 [1 favorite]


Google isn't helping me here: there was a story about 5-10 years ago about a DNA search turning up a suspect 20-30 years after the crime (rape/murder). The guy claimed to be in the army at the time, thousands of miles away, but the records had been lost. The "billions to one" certainty was cited at the time; I wonder how his case turned out.
posted by 445supermag at 1:17 PM on July 20, 2008 [1 favorite]


From the national database, they had a 9/13 match in every 27 million comparisons. That's before correcting for the fact that there are related people in the database, probably a few duplicates, and that they checked *any* 9/13. If you correct only for the 9/13, the number will shoot up to somewhere more than one match in 19 billion [there will be dependencies and missing data that make it more than this].

What's so wrong with being honest and saying 9/9 match is one in several billion? There's really nothing to hide? Of course it indicates that the probably existent policy of scanning through the database for every new sample is misguided, but I'd really hope that they're not doing that.

math: 220*220/2/903 *1000*1000 is 27M comparisons per match.
the above times 13 choose 9 is 19B comparisons per match.
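
The arithmetic above can be checked with a short sketch. It assumes the figures implied by the math line: about 220,000 profiles in the searched database and 903 matching pairs. The variable names are made up for illustration, and n*n/2 is the post's own approximation of the number of pairwise comparisons.

```python
from math import comb

# Assumed inputs, read off the arithmetic above: roughly 220,000 profiles
# in the searched database and 903 pairs that matched at 9 of 13 loci.
profiles = 220_000
matching_pairs = 903

# The post approximates the number of pairwise comparisons as n*n/2.
comparisons = profiles * profiles / 2
per_match_any_9 = comparisons / matching_pairs

# C(13, 9) counts the ways to choose which 9 of the 13 loci must match.
subsets = comb(13, 9)
per_match_fixed_9 = per_match_any_9 * subsets

print(f"{per_match_any_9:.3g} comparisons per any-9-of-13 match")
print(f"{subsets} ways to pick 9 loci out of 13")
print(f"{per_match_fixed_9:.3g} comparisons per match at one fixed set of 9 loci")
```

This reproduces the roughly 27 million and 19 billion figures. It does not correct for relatives, duplicates, or missing data, which the post says would change the result.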
posted by a robot made out of meat at 1:23 PM on July 20, 2008


There is no mafia.
posted by Artw at 1:26 PM on July 20, 2008


Just one non-matching locus would be sufficient for exclusion as a suspect. There is some wiggle room for the attorneys, such as being absolutely sure that none of the parties involved has had a blood transfusion or a bone marrow engraftment, besides ruling out any degree of consanguinity.
posted by francesca too at 1:37 PM on July 20, 2008


"There is no mafia."

It's all a Nemesis plot!
posted by ZachsMind at 1:48 PM on July 20, 2008


This is also why many anti-terrorism measures fail: see the failures of basic Bayesian inference.

(Population * False Positive Rate) / (One Single DNA). The numerator is much more likely to win. FAIL.
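
A minimal sketch of the ratio above, under loud assumptions: exactly one true source who always matches, everyone else matching independently at the false positive rate, and purely hypothetical numbers that do not come from the article. The function names are invented for illustration.

```python
def expected_false_matches(population: int, false_positive_rate: float) -> float:
    """Expected number of unrelated people who match purely by chance."""
    return population * false_positive_rate


def chance_hit_is_true_source(population: int, false_positive_rate: float) -> float:
    """Rough share of matches that belong to the one true source.

    Assumes exactly one true source, who always matches, and that every
    other person matches independently at the false positive rate.
    """
    false_hits = expected_false_matches(population - 1, false_positive_rate)
    return 1.0 / (1.0 + false_hits)


# Hypothetical, illustrative numbers only (not taken from the article).
population = 1_000_000
false_positive_rate = 1e-6

print(expected_false_matches(population, false_positive_rate))
print(chance_hit_is_true_source(population, false_positive_rate))
```

With these made-up inputs, about one chance match is expected alongside the single true one, so a hit on its own is close to a coin flip. A larger searched population pushes that share lower.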
posted by amuseDetachment at 1:58 PM on July 20, 2008 [1 favorite]


The men matched at nine of the 13 locations on chromosomes, or loci, commonly used to distinguish people...But the mug shots of the two felons suggested that they were not related...One was black, the other white.

Does anybody know if race is one of the loci? Can race/gender be determined from say a blood sample?

A DNA match doesn't necessarily mean the guy is guilty either. Just because one's semen is proved to be present in a murdered rape victim, doesn't imply the guy killed her or raped her. Other information needs to exist to suggest or prove he committed a crime. On the other hand, I've seen where the only physical evidence was a semen sample which did not match the defendant's (and almost no circumstantial evidence). He was convicted anyway. DNA is just one piece of the puzzle and even with these new findings, the odds that they are accurate are still reasonably high.
posted by sluglicker at 2:19 PM on July 20, 2008


If you work in the Justice Department, you should not be hiding the truth.
posted by popechunk at 2:25 PM on July 20, 2008 [1 favorite]


I used to eat breakfast at The Rainbow Diner in Chicago. A waitress got pissed off and threw my breakfast at me. She missed, but the hash went all over the table. So yeah, I know what you mean.
posted by sluglicker at 2:27 PM on July 20, 2008 [2 favorites]


In other news: Nobody understands hash collisions, apparently.

Also, 'use these 13 little portions of the data' isn't the best hash function to use in my opinion.
posted by TheOnlyCoolTim at 3:06 PM on July 20, 2008 [1 favorite]


Can race/gender be determined from say a blood sample?

Gender can be determined, certainly, by looking at the sex chromosomes in the fraction of the blood sample that contains genetic material. If your sample has one X and one Y sex chromosome, the sample came from a male; two Xs suggest a female. There are chromosomal abnormalities but by and large this is how it works.

Likewise, there are sets of markers (SNPs) that can be more common in a particular ethnicity. A statistical test could say with certain likelihood some things about your heritage, by looking for a specific collection of SNPs.
posted by Blazecock Pileon at 3:15 PM on July 20, 2008 [1 favorite]


This is possibly the most pernicious legacy of the Bush administration: government agencies trying to block hard science research when the results would cause problems for them.

More than this, it's a symptom of the problem with the "justice" system in this country. Namely, they don't care about punishing those truly guilty of a crime, they care about punishing whoever they can get. Upon being told maybe they're getting the wrong guys, instead of being concerned about potential injustice, they say "STFU."
posted by TheOnlyCoolTim at 3:32 PM on July 20, 2008 [3 favorites]


CODIS: The FBI Laboratory's COmbined DNA Index System Program

The COmbined DNA Index System, CODIS, blends computer and DNA technologies into an effective tool for fighting violent crime. The current version of CODIS uses two indexes to generate investigative leads in crimes where biological evidence is recovered from the crime scene. The Convicted Offender index contains DNA profiles of individuals convicted of felony sex offenses (and other violent crimes). The Forensic index contains DNA profiles developed from crime scene evidence. CODIS utilizes computer software to automatically search these indexes for matching DNA profiles.

The word "index" in COmbined DNA Index Systems is not arbitrary. CODIS is a system of pointers; the database only contains information necessary for making matches. Profiles stored in CODIS contain a specimen identifier, the sponsoring laboratory's identifier, the initials (or name) of DNA personnel associated with the analysis, and the actual DNA characteristics. CODIS does not store criminal history information, case-related information, social security numbers or dates-of-birth. Matches made among profiles in the Forensic Index can link crime scenes together; possibly identifying serial offenders. Based on a match, police can coordinate separate investigations, and share leads developed independently. Matches made between the Forensic and Convicted Offender indexes ultimately provide investigators with the identity of the suspect(s).

CODIS also supports a Population file. The Population file is a database of anonymous DNA profiles used to determine the statistical significance of a match.


Here's a bit more about CODIS (PDF) from Promega:

The current version of CODIS software supports the storage and searching of both restriction fragment length polymorphism (RFLP) and PCR-based DNA profiles...

The tools of molecular biology now enable forensic scientists to characterize biological evidence at the DNA level. Currently, the methods available to the forensic scientist include a) RFLP typing of variable number of tandem repeat (VNTR) loci (3-5) and b) amplification of specified genetic loci by the polymerase chain reaction (PCR) (6) and subsequent typing of specified genetic markers (7-13). Any material that contains nucleated cells, including blood, semen, saliva, hair, bones, and teeth, potentially can be typed for DNA polymorphisms. The typing of VNTR loci by RFLP analysis is the most discriminating, or individualizing, molecular biology technology for forensic identity testing. Although this approach is valid and reliable for forensic and paternity testing, it has certain limitations. These include: 1) a sufficient quantity of high molecular weight DNA (usually at least 50 ng) is required for RFLP analysis (14); 2) samples that have been substantially degraded can not be analyzed by RFLP typing; and 3) RFLP analysis is laborious as well as time-consuming, requiring two to eight weeks to obtain results on six VNTR markers.
An alternative strategy for forensic DNA typing is the use of PCR-based assays (6)...

At the STR Project meeting on November 13-14, 1997 the core loci for the national system were agreed upon by the participating laboratories. The 13 core loci are: CSF1PO, FGA, TH01, TPOX, vWA, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, and D21S11. To take full advantage of the power of STR typing and to ensure compatibility for the searching of DNA profiles, all participants agreed that typing all 13 STR loci would be attempted when analyzing casework...

However, there initially were different viewpoints among the participants of the STR project regarding the manner that the 13 loci would be analyzed for convicted offender samples. There were two general approaches proffered: 1) initially type all 13 STR loci for convicted offender samples prior to submission to NDIS, or 2) type a subset of the loci initially, send the subset profile to NDIS, and only type the remaining STR loci to resolve provisional "hits." Some factors considered in assessing these two approaches were cost, labor, potential sources of error, individualization potential, ability to resolve mixture samples, response time, investigative lead potential, public impression, avoidance of endorsing a particular manufacturer's STR typing kit, which loci to use in the initial subset, and database size and growth.

The advantages of typing convicted offender samples for all 13 STR loci are: the augmented discrimination potential even for mixtures or degraded samples, fewer provisional hits, better quality control, defined budgets, faster response time to the police, less documentation, enhanced nation-wide compatibility, minimal subsequent investment, no consequence of sample loss preventing a subsequent analysis, defined message to manufacturers, and sufficient data to obviate differences that may arise among primer sets from the different manufacturers.

The factors initially considered as disadvantageous to typing all 13 STR loci are: increased cost and labor, fewer laboratories to participate initially and fewer samples to be typed, the lack of availability of all 13 STR loci in commercial typing kits from a single manufacturer, and the increased time required to type all loci.


Here's the FBI page about CODIS and the DNA analysis group.

Unless the federal government has started genotyping all citizens, CODIS presumably derives its "background" or "Population" file from the collection of genetic information from forensic samples collected at crime scenes, and federal convict DNA samples.

I work in a lab that works on the human genome. Nearly all of the statistical tests we run have to be careful about what we use for a "background" — this background is used to differentiate a true "hit" from random chance. I'd be curious to learn more about how the FBI generates that background, and how false positives are tested for, if at all.

Aside from that, a good defense lawyer could raise questions about how the forensic DNA evidence was collected and handled. This was one of the defenses raised by OJ Simpson's legal team, for example, when Detective Mark Fuhrman planted evidence on the crime scene.
posted by Blazecock Pileon at 3:41 PM on July 20, 2008 [2 favorites]


The men matched at nine of the 13 locations on chromosomes, or loci, commonly used to distinguish people...But the mug shots of the two felons suggested that they were not related...One was black, the other white.

This is not foolproof.
posted by Knappster at 4:03 PM on July 20, 2008


After the judge, Steven Platt, rejected her arguments, [the state's DNA administrator Michelle] Groves returned to court, saying the search was too risky. FBI officials had now warned her that it could corrupt the entire state database, something they would not help fix, she told the court.

Yes, this process does appear vulnerable to corruption...
posted by ryanrs at 5:11 PM on July 20, 2008


I think Floyd Landis is up on this research.
posted by caddis at 6:36 PM on July 20, 2008


In the 1990s, FBI scientists estimated the rarity of each genetic marker by extrapolating from sample populations of a few hundred people
... and yet, they're apparently uninterested in getting any new data on that by looking at the whole set now.

I wonder how [does one find out how] many people have been convicted based primarily (or entirely) on their DNA "profile"; Wikipedia's genetic fingerprinting article says the first in the U.S. was in 1987. If it turns out conclusively that DNA profiles have been accompanied in the courts by some extremely faulty statistics regarding the probability of false positive matches, shouldn't there soon be a large number of prisoners needing to be released and/or retried?

Could our criminal justice system actually have been relying on a naive hash function for over two decades?!
posted by finite at 7:45 PM on July 20, 2008


This was one of the defenses raised by OJ Simpson's legal team, for example, when Detective Mark Fuhrman planted evidence on the crime scene.

Please clarify. Do you believe Fuhrman planted evidence?
posted by Cool Papa Bell at 8:19 PM on July 20, 2008


Please clarify. Do you believe Fuhrman planted evidence?

Judge Ito believed it in OJ Simpson's criminal case; as did Judge Fujisaki in the civil case, due to Mark Fuhrman's plea of no contest to perjury.

There are ways that DNA evidence can be questioned in court, above and beyond the technical and statistical problems of associating a forensic sample with the defendant's genetic constitution (contamination during the PCR amplification step, and interpretation problems afterwards, which is what this post is about). I didn't particularly care about Simpson's innocence or guilt in raising the point.

The simple presence of evidence is itself not conclusive, especially if one of the detectives involved is a known bigot and would have motive to plant evidence on the accused. That the evidence is DNA doesn't necessarily ease the burden of proof.

But since you're asking me directly: Yes, it seems probable that Fuhrman planted evidence, and I do believe he did this. That did not change my opinion of guilt or innocence of this particular defendant, which seemed irrelevant to the discussion of the use of forensic DNA fingerprinting.
posted by Blazecock Pileon at 9:22 PM on July 20, 2008 [2 favorites]


Heh. I thought about posting this, but I thought I should do the math about the birthday paradox first...
posted by Pronoiac at 10:53 PM on July 20, 2008


[some namecalling removed - metatalk for that, thanks.]
posted by jessamyn at 8:11 AM on July 21, 2008




This thread has been archived and is closed to new comments