Prejudice in Forensic Science, Part 2: Dr. Dror’s Dilemma

Amanda Knox

Exoneree, journalist, public speaker and author

In my previous article, I outlined the basics of cognitive bias: how it operates at a subconscious level and can influence decisions among even the most logical-minded, scientific professionals, including supposedly impartial medical examiners and detectives. I also introduced the work of Dr. Itiel Dror, whose research reveals how such bias enters into decisions on autopsies and causes of death that should be neutral, fact-based science. Here, I’ll unpack the campaign against his work from the forensic science professionals who sought to silence him.

***

It started with a paper he published in February 2021 in the Journal of Forensic Sciences, which got notable publicity from the Washington Post. In Dr. Dror’s study, 133 medical examiners and/or forensic pathologists were given identical medical data about a toddler who’d died upon arrival at the hospital. But as with his fingerprint study, the context was altered. In one variation, the child was white and had been brought to the hospital by their grandmother. In another, the child was Black and was brought to the hospital by their mother’s boyfriend. The participants were tasked with determining the manner of death. The results were overwhelming. In the Black child/mother’s boyfriend scenario, the examiners were five times more likely to rule the death a homicide than an accident, while in the white child/grandmother scenario, they were twice as likely to rule the death an accident as a homicide.

The backlash was swift. The President of the National Association of Medical Examiners (NAME), Dr. James Gill, filed a formal complaint with University College London, where Dr. Dror is a professor. “The complaint was very nasty and personal,” Dr. Dror says. In the complaint, Dr. Gill referred to Dr. Dror’s research as deceptive and showing a flagrant disregard for ethics. This was largely because Dr. Dror did not let the study participants know ahead of time that the study was about cognitive bias. Dr. Gill complained that all the participants “were misled and now essentially are branded as biased by race in our manner of death determinations.” Dr. Gill went on to criticize Dr. Dror’s methodology, attacking the study design and statistical analysis, writing that “the study is embarrassing for science, the authors, and any institution affiliated with it,” and that Dr. Dror made mistakes that “a junior trainee should rightly be reproached for” and that “should never be utilized by a Full Professor.”

Dr. Gill, writing on behalf of NAME, which is the leading organization of medical examiners, demanded that “This work must stop immediately. We urge your body to take immediate and decisive action to assure this work is not directed by, encouraged or done by individuals at your institution.”

While Dr. Dror was being attacked at his place of employment, the National Association of Medical Examiners also went after him in the pages of the Journal of Forensic Sciences. A group of over 80 forensic pathologists, including Dr. Gill, wrote a letter to the editor saying that Dr. Dror’s paper was fatally flawed, should be retracted, and that it represented an abject failure of the peer review process. “It is academically vacuous, intellectually dishonest, and intentionally deceptive,” they wrote. Dr. Dror and his colleagues responded, and the ensuing back and forth led to nine commentaries and twenty-two letters, more than any prior article published by JFS in over sixty years.

I reached out to Dr. Gill for comment, but he declined, so I dug into these letters and commentaries by him and his colleagues to see if there was any merit to the complaints against Dr. Dror’s study.

In short, they attacked Dr. Dror’s work on every front imaginable. They complained that he had reached out to participants on NAME’s membership rolls without first seeking the organization’s approval. Dr. Dror noted in response that NAME had previously refused to distribute a survey on cognitive bias to its members, leading him to reach out to pathologists and medical examiners independently.

They made much of the fact that the participants, though all NAME members, were not necessarily board-certified forensic pathologists (even if such pathologists made up the vast majority), but an anonymous group that included coroners, medical examiners, and others who make manner of death determinations. “To be clear,” Dr. Dror replied, “our study is not about forensic pathologists per se, but about forensic pathology decisions, and in particular, manner of death decisions, which are often made by a variety of people.” And indeed, the study very explicitly stated that it was using the terms “coroner,” “forensic pathologist,” and “medical examiner” interchangeably, a practice used in other papers published in JFS without reproach. “NAME’s fixation with how many of our participants were board-certified forensic pathologists vs. medical examiners or coroners is ill-informed and misses the point,” Dror wrote. “Their argument is predicated on the fallacious assumption that expertise, training and/or certification protect against bias, which it unequivocally does not.”

The two most important critiques from the National Association of Medical Examiners, those most worth addressing at length, centered on the definition of “medically relevant information” and on the inclusion of race as a variable in the scenarios.

“The first, and perhaps the worst error made by the authors,” they wrote, “is the statement, unattributed and untrue, that caretaker relationships are ‘medically irrelevant.’” The letter cited extensive medical literature and consensus to the contrary, citing one study that found that the “odds ratio of abuse in the case of a boyfriend caretaker was 169.2, while in the case of a grandmother, it was 0.34. Thus, the odds ratio for the boyfriend is 497 times that of the grandmother.” Another study they cited found that biologically unrelated male caretakers are more likely than genetically related fathers to use blunt force trauma when killing children in their care. “To claim that the combination of biologically unrelated male caregiver and blunt trauma death is ‘medically irrelevant’ flies in the face of the medical literature and established clinical practice,” they wrote.
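For readers who want to check the arithmetic in that quotation, here is a minimal sketch in Python. The two odds ratios are the figures quoted in the letter; the variable names are my own, and dividing one odds ratio by the other simply reproduces the letter’s comparison rather than endorsing it as a statistical method.

```python
# Odds ratios as quoted in the letter to the editor (not independently verified here).
boyfriend_odds_ratio = 169.2
grandmother_odds_ratio = 0.34

# The letter compares the two by dividing one odds ratio by the other.
relative_ratio = boyfriend_odds_ratio / grandmother_odds_ratio

print(f"{relative_ratio:.1f}")  # roughly 497.6; the letter reports this as 497
```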

Essentially, they maintained that the contextual information about the caregiver is relevant—and in fact essential—to their medical findings about the manner of death. While this sounds reasonable at first glance—it wasn’t obvious to me that they shouldn’t be forming conclusions based in part on that information—Dror very incisively lays out the problem with this practice, regardless of whether it is established and widely accepted. It’s worth quoting him at length:

Concluding that an individual case was a homicide because the risk to a child based on contextual data gleaned from large groups is a known bias: the ecological fallacy [that] results from making a causal inference about individual phenomena on the basis of observation of groups.

The question asked of each participant in the survey was not whether boyfriends are generally more likely to kill children than grandmothers, but whether, given equivocal medical information, this individual boyfriend or grandmother killed this individual child or the child died as the result of an accident.

Furthermore, the statistical data about non-biological caregivers, whatever it shows, may be a result of self-perpetuating bias…An analogous self-perpetuating bias is the policing of Black people. Police suspect that Black people are more likely to have drugs or carry guns because proportionally more Black people are convicted of and serving time for such offenses. Hence, police stop and search Black people more often. As a result, Black people are therefore more often arrested, convicted, and incarcerated for firearm and drug offenses—precisely because police stop and search Black people more often. As this cycle of bias repeats and feeds itself, the bias perpetuates and gets stronger—a phenomenon known as the bias snowball effect.

The critics have a reasonable reply to this. “Were we to apply the author’s assertions, medical diagnosis would be impossible…All diagnoses are probabilistic, all diagnoses apply general statistics to individual cases, and all diagnoses work by making explicit or implicit assumptions about the absence of intervening factors.” They also note that manner of death determinations were created for public health statistics, not for use in a court of law, and thus social and contextual information is relevant.

But whatever the origin of the practice of manner of death determinations, it is nonetheless true, as Dr. Dror noted in response, that “manner of death determinations in individual cases are also a basis to initiate criminal investigations that frequently result in charging people with crimes. Thus, manner of death determinations have grave consequences far beyond ‘statistical purposes.’”

And while it may be true that any medical diagnosis depends upon the so-called ecological fallacy—making causal inferences about an individual based on group statistics—in the case of medical diagnoses, the patient retains autonomy, can seek a second diagnosis, and must consent to any intervening procedures recommended on the basis of that diagnosis. A finding of homicide in a manner of death examination, on the other hand, can lead to an immediate loss of autonomy—arrest, conviction, and imprisonment. There’s good reason that the standards should be different in these cases, and the dispute between Dr. Dror and the National Association of Medical Examiners here appears to result from the fact that manner of death investigations straddle the line between the medical world and the criminal justice world.

The National Association of Medical Examiners insists that this isn’t their problem. “The fact that this tool for aggregate statistics often does not fit well in court is not a criticism of manner determination by forensic pathologists. It is instead a criticism of misuse of manner determination by the courts.” Dr. Dror contends that regardless of how the practice originated, or what is considered standard procedure in the medical community, that procedure is flawed because it can lead to wrongful convictions.

The second major critique leveled at Dr. Dror’s study was the inclusion of race as a variable alongside the variable of caretaker. “The focus on race in this article,” Dr. Gill et al. wrote, “moves the construction of the study from inexplicable to absurd.” Their complaint is both statistical and personal. “These authors essentially conflated caretaker relationship and race to provide themselves with an opportunity of making accusations of race bias,” they wrote, claiming outright that this was an effort by Dr. Dror “to label the survey responders, and their colleagues by proxy, as racists.”

And here we reach the cause of the intense backlash against Dr. Dror’s work—nobody wants to be called a racist. The statistical critique from NAME is that the variables of race and caretaker status have been confounded. “This is particularly egregious because they do not dispute that had they reversed the race in their example, it would show the opposite result that they claim—that respondents would have called “White” cases homicide more often.” The claim here is essentially that the survey respondents were not at all being influenced by the race variable, but they were being influenced by caretaker status, because they were appropriately (in NAME’s view) taking that information into account in making their manner of death determination.

The original study, however, made it very clear that “the data do not allow us to ascertain whether they were biased by the race of the child or/and characteristics of the caretaker.” Dr. Dror contends that there is a misconception that they manipulated two variables, race and caretaker identity, when in fact they only manipulated one: the presence of non-medical irrelevant information. He offers an analogy:

Imagine we did a study examining one variable: whether food intake impacts weight. We take a group of people and deprive them of chocolates and make them eat lots of vegetables. We find out that they lost weight, and our conclusion is that food intake impacts weight. Yes, we cannot ascertain whether it is the reduced consumption of chocolates or/and the extra vegetables that underpin the weight loss.

But vegetables and accusations of racism do not have the same emotional resonance. While I’m sympathetic to how painful it is to be branded as a racist (I’ve been called racist for the coerced admission I signed), any survey-based study on bias can’t reveal what it is studying to its participants without nullifying its results. Why did Dr. Dror et al. include multiple variables? One reason is to avoid the Hawthorne effect, where a subject’s behavior is altered by their awareness of being observed. As Dr. Ken Obenson wrote in one of these twenty-two letters published in JFS, “Had the study been designed to apply the same set of variables to both Black and White decedents with their race as the sole constant, it may have reached different conclusions. Notwithstanding, it is likely that many participants would have been clued into the objective of the study and modified their responses accordingly.”

The most generous interpretation I can make of NAME’s critique is this: the survey results weren’t influenced by the variable of race, but by caregiver identity, as they should have been, but now everyone is going to call us racists! But also, you can’t say anything definitive about forensic pathologists unless you are looking at a cohort that is 100% composed of board-certified forensic pathologists, because true experts…would have given different results?

This is where the response from NAME becomes contradictory, and takes on that Trumpian flavor of “I didn’t do it, but if I did, there’s nothing wrong with that.” The prosecution in my own case did a similar thing regarding the knife. They asserted, contrary to evidence, that it did fit all of Meredith’s wounds, but tacked on “there was probably a second knife we haven’t found.” NAME’s emphasis on procedure, labeling, and expertise implies that they disagree with the study’s findings, but in the next breath they assert that forensic pathologists should be influenced by what Dr. Dror considers non-medical information, which they were, just not by the race variable.

Next, in the final segment of this three-part series, I’ll cover the fallout of NAME’s campaign against Dr. Dror, and the changes he believes we could make to create a fairer, more impartial way of approaching the science behind criminal justice.