Yeah, No, That Study Doesn’t Debunk Police Racism

Sloppy social science and the mental gymnastics of racism deniers.

Some people will say anything to deny the problem of racism in policing.

These are people who would have found ways to defend Bull Connor in Birmingham too, or Jim Clark and his goons in Selma six decades ago.

One thing about their denials has changed though — they’ve become more sophisticated.

Increasingly, such folks wrap their denial in a patina of respectable “evidence,” whereas, back in the day, they would have just said something about how those n-words were asking for trouble and left it at that.

But bullshit, even when footnoted, is still bullshit.

White racism deniers love ’em some Roland Fryer

My favorites are the white folks who send around the study from a few years ago by Roland Fryer, a Harvard academic, which concluded police were no more likely to use lethal force against Blacks than whites.

They love this one because Fryer is Black.

Apparently, if a Black guy says there’s no racism in policing — or if that’s what they think he’s saying — there must not be.

It’s funny — first, because conservative white people are so quick to latch on to any Black person who they think confirms their nonsense, and second, because they don’t understand what the Fryer study says, why much of it doesn’t support their view, and why the part that does is seriously flawed.

The Fryer study looked at four data sets, mainly focusing on three: stop-and-frisk data from New York City, data from 12 large cities or counties in Texas, Florida, and California, and a special data set from Houston.

The racism deniers focus on the finding that there was no racial disparity in use of lethal force, but before examining the data used to reach that conclusion, it’s worth looking at what the deniers ignore.

Non-lethal force shows clear disparity

Looking at non-lethal force, Fryer relied on stop-and-frisk data from New York for 2003–2013 and found that Black New Yorkers were 53 percent more likely than whites to be met with non-lethal force by the NYPD.

Interestingly, when he controlled for variables like civilian behavior during the stop — did they resist arrest, for instance — or the neighborhood crime rate, not only did this not reduce the disparity, it sometimes increased it.

This means police were using force against African Americans even in cases where they put up less resistance and in parts of town where crime rates were not elevated.

Nonetheless, when Fryer controlled for 125 supposedly non-racial variables, the observed disparity in non-lethal force fell from 53 percent to 17 percent — still significant, albeit less so.

But how is this possible?

If the disparity remained huge even when suspect behavior and neighborhood crime rates were held constant, what variables could have had such a dampening effect on the disparity?

We don’t know for sure. The complete list wasn’t provided in Fryer’s paper. But what we do know about them is methodologically troubling.

Consider his controls for “community dangerousness.”

As noted previously, Fryer examined the neighborhood crime rates and actual suspect behavior during encounters because these would predictably increase the likelihood of police use of force.

But remember, neither of these controls reduced the racial disparities; if anything, they tended to increase them.

So, where did the reductions come from?

According to Fryer, three “precinct effects” cut racial disparities in the use of force by nearly 20 percentage points — more than a third below their initial level. And what were those?

According to Fryer, they were socioeconomic variables often correlated with crime rates: median education levels, median income, and median levels of unemployment in a neighborhood. As Fryer puts it, these are “proxies for dangerousness.”

But why control for “proxies for dangerousness” when you’ve already controlled for neighborhood crime rates and the behavioral dynamics of particular stops?

At that point, Fryer had already controlled for dangerousness, and by a more direct method than estimating it through socioeconomic proxies.

If the crime rate in a neighborhood fails to explain the racial disparity, controlling for variables that merely correlate with a higher crime rate is superfluous. And if actual encounter dynamics fail to explain the racial disparity, controlling for variables that might predict greater resistance by civilians is equally absurd.

Either the person who was stopped resisted or they didn’t. If they did, Fryer had already controlled for that. If they didn’t, the fact that there are many unemployed high school dropouts living on the block can hardly justify cops throwing someone who isn’t resisting against a wall.

Ultimately, even though he artificially minimizes the problem, Fryer’s data shows Black folks are much more likely to be handled violently by police. And this is so, even when they put up less resistance, comply with all demands, have no weapons, and have committed no crime.

Of course, this finding is ignored by those who point to Fryer’s research as vindication of their racism denial.

Lethal force data shows disparity too — Fryer’s data sets are garbage

When we look at Fryer’s data on lethal force, his conclusions are dubious to the point of being laughable.

First, let’s look at the data set from Houston, which consisted of interactions where officers fired at suspects or specific high-risk arrest scenarios where lethal force would have been most likely.

The Houston data doesn’t disprove racial bias

Here, Fryer discovered no real racial difference in the likelihood that Blacks, as opposed to whites, were shot by police once subjected to a stop or arrest.

But a central flaw in Fryer’s analysis is the suggestion that bias can only be operating if Black people are more likely to be shot by police than whites once both have been stopped.

Although such a position may seem intuitive, it doesn’t hold up to scrutiny for two reasons:

  1. Racism can influence who gets stopped in the first place — and thus, how many encounters there are between cops and Blacks versus cops and whites — and,
  2. Police could be confronting Black folks for more subjective, less legitimate reasons.

If the latter is true, the Black people stopped would be less likely to have been doing anything serious, and thus less likely to react violently and end up being shot.

Don’t like ads? Become a supporter and enjoy The Good Men Project ad free

If I’m Black and you stop me because of racialized suspicion and bias, and our encounter doesn’t result in a shooting — which it shouldn’t since I hadn’t even done anything to justify the stop — you can’t use your lack of deadly force against me as proof of goodwill.

And if police are more likely to stop Black folks in the first place for reasons of bias, then the risk they face in the general population would still be higher.

A hypothetical can demonstrate the point.

Imagine a community where the white-to-black population ratio is 5 to 1 (similar to the U.S.), with 120,000 people: 100,000 whites and 20,000 Blacks.

 

EXAMPLE: 1 in 200 vs. 1 in 2,000

And imagine that in a given year, police stopped 10,000 Black people (half the Black population) and 5,000 whites (5 percent of white folks). And of the 10,000 Blacks stopped, 100 were shot by police, and of the 5,000 whites stopped, 50 were.

In both cases, the odds of being shot once stopped would be one percent, but 1 in 200 Blacks would have been shot, compared to 1 in 2000 whites.
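The arithmetic behind this hypothetical is easy to verify. The figures below are the article’s invented numbers, not real data:

```python
# Hypothetical figures from the example above (not real data).
black_pop, white_pop = 20_000, 100_000
black_stops, white_stops = 10_000, 5_000
black_shot, white_shot = 100, 50

# Conditional risk: chance of being shot once stopped (Fryer's question).
print(black_shot / black_stops, white_shot / white_stops)  # 0.01 0.01

# Population risk: chance of being stopped and shot at all.
print(black_shot / black_pop)    # 0.005  -> 1 in 200
print(white_shot / white_pop)    # 0.0005 -> 1 in 2,000
```

The conditional rates are identical, yet the overall risk to Black residents is ten times higher, because the stop rate itself is where the disparity lives.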

The question isn’t, “Once whites are stopped, are they as likely as Black people who’ve been stopped to be shot?”

The question is: “Are white people, walking down the street, driving their vehicle, or just living their lives, as likely to be stopped in the first place and then shot as Black people?”

The answer to that is no, and nothing in the Fryer study suggests otherwise.

The 10-city data set is no better

In addition to the special data set compiled for him by the Houston PD, Fryer examined a second data set, drawn from 10 cities and counties in Texas, Florida, and California, involving interactions where officers had discharged their weapons.

Since everyone in the data set had been shot at by police, Fryer wasn’t seeking to determine the relative risk of whites or Blacks being shot by cops, but rather, how quickly officers had discharged their weapons.

Did police shoot before or after being attacked by the civilian? Ultimately, Fryer found there was no significant difference based on race.

Perhaps the question of how quickly an officer decided to shoot is an interesting one to explore. Still, it seems far more important to determine the relative risk of being shot as an unarmed Black person compared to an unarmed white person than to narrowly focus on a cop’s reaction time.

Although Fryer suggests it would have been impossible to answer this larger question, other researchers have been more ambitious.

One recent study found that the odds of being Black, unarmed, and shot by police in Los Angeles county (one of the places Fryer examined) are twenty times higher than the odds of being white, unarmed, and shot by police there.

And honestly, what fact do you think would be more important to the average Black person?

  1. When they get shot by cops, unarmed whites are shot just as quickly as unarmed Blacks are, or
  2. Unarmed white people are only one-twentieth as likely as unarmed Blacks to be shot in the first place.

I’ll wait.


Reaction time differences are a stupid metric in that we shouldn’t expect them to vary all that much, especially in high-risk situations like the ones examined by Fryer. An officer doesn’t have the luxury of much reflection when a gun is pointed at them, or they’re being attacked, no matter the suspect’s race.

But that hardly means that racial bias wasn’t operating at the point where the person was stopped in the first place.

Nor does it preclude bias regarding whether the officer perceived danger and chose to fire at all.

Imagine a community where police shot 500 Black people in a year, 300 of whom were attacking and 200 of whom were not, and only five whites, three of whom were attacking and two of whom were not. As per Fryer, there would be no racial bias: for both groups, 60 percent of the shootings occurred after the officer was attacked and 40 percent before an attack.

But seriously? Does it seem remotely logical to suggest there isn’t a problem here in terms of greater risk for Black people, relative to their share of the population and of the non-attacking population?
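The same point can be checked in a few lines. Again, these are the hypothetical counts from the paragraph above, not real data:

```python
# Hypothetical shooting counts (not real data).
shot = {
    "Black": {"attacking": 300, "not_attacking": 200},
    "white": {"attacking": 3, "not_attacking": 2},
}

for race, counts in shot.items():
    total = sum(counts.values())
    # Fryer-style metric: share of shootings that followed an attack.
    print(race, total, counts["attacking"] / total)
# Both groups show the same 0.6 share of "after attack" shootings,
# yet one group absorbs 100 times as many shootings overall.
```

A metric defined only over people already shot is blind, by construction, to how many people get shot in the first place.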

What the facts say, deniers notwithstanding

The facts are these, no matter what liars and fools choose to believe:

  • Black folks killed by police are 2.3 times more likely than whites killed by police to have been unarmed at the time, and whites killed were about 50 percent more likely than Black victims to have been shot while attacking the officer or another civilian.
  • Likewise, the rates of police-involved shootings bear little if any relationship to crime rates in the places where those shootings occur. This is why some communities with much higher crime rates have lower rates of police-involved shootings than cities with less serious crime problems.
  • Ultimately, police are just as likely to shoot an unarmed Black person as an armed white person in this country.

That’s what matters — not the beliefs of internet trolls looking for any “evidence” to justify their biases and ways to rationalize disparate treatment of Black people.

Not that facts will likely matter to the kinds of folks who make these silly arguments.

But at least you can’t say we never offered a rebuttal to Roland Fryer and his white conservative fan club.

Some of y’all need to find a better mascot.

Race, Policing, and the Limits of Social Science

In 2016 economist Roland G. Fryer, Jr., the youngest African American ever to be awarded tenure at Harvard, came upon what he would call the “most surprising result of my career.” In a study of racial differences in the use of force by police officers, Fryer found that Black and Hispanic civilians were no more likely than white civilians to be shot to death by police. “You know, protesting is not my thing,” Fryer told the New York Times. “But data is my thing. So I decided that I was going to collect a bunch of data and try to understand what really is going on when it comes to racial differences in police use of force.”

The entwining of data and theory is especially fraught in the study of race.

Three thousand hours later, after meticulous records collection and analysis, the data had spoken. Although Black people are significantly more likely to experience non-lethal force at the hands of police than white people in similar situations, Fryer concluded, there is no racial bias in fatal police shootings. The findings appeared to have direct implications for the growing protest movement that had swept across the United States following the police killings of Michael Brown, Eric Garner, Tamir Rice, Freddie Gray, Philando Castile, among so many others. “It is plausible that racial differences in lower-level uses of force,” Fryer wrote at the end of the paper, “are simply a distraction and movements such as Black Lives Matter should seek solutions within their own communities rather than changing the behaviors of police and other external forces.”

The study quickly came under fire. For the most part, critics took one of two tacks. One was to argue that the research failed on its own technical terms: the data were erroneous or misleading; there was a mathematical error in the analysis; the statistical protocol was inappropriate. The other tack was to undermine the legitimacy of the effort on auxiliary grounds, pointing out that economists are not experts in the study of police shootings and that the profession of economics suffers from a conservative bias. These two types of reply exemplify a common pattern of response to the results of quantitative social science, and they are also once again on wide display in national conversations about policing. But they illustrate a deep problem with the way we think about the nature of social scientific inquiry—and, consequently, its capacity to inform our thinking about politics and policy.

On these two views, scientific method is either so airtight that only errors from within can undermine it or so porous that its validity turns entirely on outside interests. There are certainly cases of each kind of blunder: recall the Reinhart-Rogoff Excel spreadsheet snafu on the one hand, tens of millions of dollars funneled by ExxonMobil to fund climate change denialism studies on the other. We misunderstand run-of-the-mill scientific practice, however, if we view it as either everywhere or nowhere settled by data. As historians and philosophers of science have long emphasized, “the data” can never take us all the way from observation to conclusion; we can interpret them only against some background theory that settles what the data are evidence of. Far from playing no role in quantitative social science, a shared set of theoretical and normative commitments is what allows data-first methods to work at all.

What we should believe—and what we should give up believing—can never be decided simply by brute appeals to data.

This entwining of data and theory runs through any application of quantitative methods, but it is especially fraught today in the study of race. Since the 1970s, the development of causal inference methodology and the rise of large-scale data collection efforts have generated a vast quantitative literature on the effects of race in society. But for all its ever-growing technical sophistication, scholars have yet to come to consensus on basic matters regarding the proper conceptualization and measurement of these effects. What exactly does it mean for race to act as a cause? When do inferences about race make the leap from mere correlation to causation? Where do we draw the line between assumptions about the social world that are needed to get the statistical machinery up and running and assumptions that massively distort how the social world in fact is and works? And what is it that makes quantitative analysis a reliable resource for law and policy making?

In both academic and policy discourse, these questions tend to be crowded out by increasingly esoteric technical work. But they raise deep concerns that no amount of sophisticated statistical practice can resolve, and that will indeed only grow more significant as “evidence-based” debates about race and policing reach new levels of controversy in the United States. We need a more refined appreciation of what social science can offer as a well of inquiry, evidence, and knowledge, and what it can’t. In the tides of latest findings, what we should believe—and what we should give up believing—can never be decided simply by brute appeals to data, cordoned off from judgments of reliability and significance. A commitment to getting the social world right does not require deference to results simply because the approved statistical machinery has been cranked. Indeed in some cases, it may even require that we reject findings, no matter the prestige or sophistication of the social scientific apparatus on which they are built.


An instructive object lesson in these issues can be found in a controversy ignited last summer, when a paper published in the American Political Science Review (APSR) in late May questioned the validity of many recent approaches to studying racial bias in police behavior, including Fryer’s. A very public skirmish among social scientists ensued, all against the backdrop of worldwide protests over the murder of George Floyd by Minneapolis police officer Derek Chauvin.

The APSR paper focused, in particular, on the difficulties of “studying racial discrimination using records that are themselves the product of racial discrimination.” The authors—Dean Knox, Will Lowe, and Jonathan Mummolo—argued that

when there is any racial discrimination in the decision to detain civilians—a decision that determines which encounters appear in police administrative data at all—then estimates of the effect of civilian race on subsequent police behavior are biased absent additional data and/or strong and untestable assumptions.

The trouble, in short, is that “police records do not contain a representative sample” of people observed by the police. If there is racial bias reflected in who gets stopped and why—and we have independent reason to believe that there is—then police data for white and non-white arrestees are not straightforwardly comparable without making additional implausible or untestable assumptions. Such “post-treatment bias” in the data would thus severely compromise any effort to estimate the “true” causal effects of race on law enforcement behavior, even if we are only interested in what happens after the stop takes place. “Existing empirical work in this area is producing a misleading portrait of evidence as to the severity of racial bias in police behavior,” the authors conclude. Such techniques “dramatically underestimate or conceal entirely the differential police violence faced by civilians of color.” The authors therefore call for “future research to be designed with this issue in mind,” and they outline an alternative approach.
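A stylized toy model, with entirely made-up thresholds, shows how this kind of selection can hide or even invert bias. Suppose civilians carry a latent “threat” score uniform on [0, 1], officers stop anyone above a race-dependent suspicion threshold (the biased step), and force follows a race-neutral rule applied to everyone above 0.8:

```python
# Toy model of post-treatment bias (all numbers are illustrative).
stop_threshold = {"white": 0.5, "Black": 0.2}  # biased stopping rule
force_threshold = 0.8                          # race-neutral force rule

for race, t in stop_threshold.items():
    p_stopped = 1 - t  # share of the group stopped (threat is uniform on [0, 1])
    p_force_given_stop = (1 - force_threshold) / p_stopped
    print(race, round(p_force_given_stop, 2))
# white 0.4, Black 0.25: conditioning on the biased stop makes an
# identical force rule look *less* harsh toward Black civilians,
# even though they face far more stops, and more force, overall.
```

Because the biased stop sweeps low-threat Black civilians into the data, the stopped Black sample is less threatening on average, and the post-stop comparison runs backwards. This is the sense in which records that are themselves the product of discrimination can "dramatically underestimate or conceal" it.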

Where do we draw the line between assumptions needed to do the statistics and assumptions that distort how the social world really works?

A critical response by several other scholars—Johann Gaebler, William Cai, Guillaume Basse, Ravi Shroff, Sharad Goel, and Jennifer Hill—appeared a month later, in June; for simplicity, call this group of scholars the second camp. Disputing the APSR authors’ pessimistic assessment of research on racial bias in policing, they countered that the APSR paper rested on a “mathematical error.” The usual methods could still recover reliable estimates of the causal effect of race on law enforcement behavior after a stop has been made, even if police stops are themselves racially biased. The error, they asserted, lay in treating certain strong conditions as necessary for making reliable estimates using data like Fryer’s. In fact, these scholars wrote, a weaker statistical condition—what they term “subset ignorability”—would also suffice, and it was more likely to hold, “exactly or approximately,” in practice. They then attempted to show how the standard causal estimation techniques can be saved by putting forth their own analysis of racial bias in prosecutors’ decisions to pursue charges (again relying on the sort of data from police records that the APSR authors find problematic).

In the days following this exchange, what ensued can only be described as a high-profile statistical showdown on Twitter, punctuated by takes from interested onlookers. The second camp mounted a defense of the mathematics, arguing that progress in statistical methods should not be foreclosed for fear of unobservable bias. In a policy environment that increasingly looks to quantitative analyses for guidance, Goel wrote, “categorically barring a methodology . . . can have serious consequences on the road to reform.” The APSR authors, by contrast, emphasized what they took to be the purpose of applied social scientific research: to provide analysis at the service of real-world policy and practical political projects. Knox, for example, wrote that their critics’ argument “treats racial bias like a game with numbers.” Instead, he went on, he and his co-authors “use statistics to seek the best answers to serious questions—not to construct silly logic puzzles about knife-edge scenarios.” This is no time, the APSR authors argued, to fetishize mathematical assumptions for the sake of cranking the statistical machinery.


What are we to make of this debate? Despite the references to mathematics and the sparring of proof-counterexample-disproof, which suggest a resolution is to be found only in the realm of pure logic, the dispute ultimately comes down to a banal, congenital feature of statistical practice: the plausibility of the assumptions one must make at the start of every such exercise. For the APSR authors, even the second camp’s weaker assumption of subset ignorability fails the test of empirical scrutiny: to them it is clearly implausible as a matter of how the social world in fact is and works. Ironically though, given their forceful criticism of the APSR paper, the second camp comes to the same conclusion in their own analysis of prosecutors’ charging decisions, conceding that “subset ignorability is likely violated”—thus rendering their own results empirically suspect.

We cannot justify our use of implausible assumptions solely on the basis of mathematical convenience.

This curious episode demonstrates how the social scientist is so often trapped in a double bind in her quest to cleave to her empirical commitments, especially when it comes to the observational studies—as opposed to randomized experiments—that are the bread and butter of almost all quantitative social science today. Either she buys herself the ability to work with troves of data, at the cost of implausibility in her models and assumptions, or she starts with assumptions that are empirically plausible but is left with little data to do inference on. By and large, quantitative social science in the last two decades has taken the former route, thanks in significant part to pressure from funding incentives. If implausible assumptions are the price of entry, the Big Data revolution promises the payment is worth it—be it in profit or professional prestige. As the mathematical statistician David A. Freedman wrote, “In the social and behavioral sciences, far-reaching claims are often made for the superiority of advanced quantitative methods—by those who manage to ignore the far-reaching assumptions behind the models.”

But if the social scientist is genuinely committed to being empirical, this choice she must make between plausible assumptions and readily available data must itself be justified on the basis of empirical evidence. The move she winds up making thus tacitly reveals the credence she has toward the theories of the social world presently available to her, or at least the kind of commitments she is willing to be wrong about. Precisely to the extent that social science is something more than mathematics—in the business of figuring out how the world is, or approximately is—statistical assumptions can never shake off their substantive implications. The requirement that social science be truly “evidence-based” is thus extremely demanding: it means that we cannot justify our use of implausible assumptions solely on the basis of mathematical convenience, or out of sheer desire to crank the statistical machinery. It is only in the belief that our assumptions are true, or true enough, of the actually existing world that social science can meet this exacting demand.

Notice the role that normativity plays in this analysis. If, as the first step to embarking on any statistical analysis, the quantitative social scientist must adopt a set of assumptions about how the social world works, she introduces substantive theoretical commitments as inputs into her inquiry. This initial dose of normativity thus runs through the entire analysis: there is simply no escaping it. Whether any subsequent statistical move is apt will depend, in however complex ways, on one’s initial substantive views about the social world.

A shared set of theoretical and normative commitments is what allows data-first methods to work at all.

What do these reflections mean in the specific case of research on race and policing? Whether one has in fact distilled the causal effect of race on police behavior in any particular study will depend on what one believes to be true about the racial features of policing more broadly. And since what positions you take on these matters depend on your background views regarding the prevalence and severity of racial injustice as an empirical phenomenon, whether a finding ends up passing statistical muster and therefore counts as an instance of racially discriminatory police action will depend on your broader orientation to the social world.

The upshot of these considerations is that statistical analysis is inescapably norm-laden; “following the data” is never a mechanical or purely mathematical exercise. But this fact should not lead us to discard any commitment to empirical validity as such. On the contrary, it should simply serve to remind us that standards of empirical scrutiny apply throughout the whole of any methodology. As Freedman put it, “The goal of empirical research is—or should be—to increase our understanding of the phenomena, rather than displaying our mastery of technique.”


One important consequence of this orientation, I think, is that we ought to subject not just assumptions but also conclusions to empirical scrutiny. To some observers of our social world, the conclusion that there is no causal effect of race in police shootings is not only implausible: it is simply and patently false. For even a cursory glance at descriptive summary statistics reveals wide gulfs in the risk of being killed by police for Blacks compared to whites. According to one study, Black men are about 2.5 times more likely to be killed by police than white men, and data from the 100 largest city police departments show that police officers killed unarmed Black persons at four times the rate of unarmed white persons—statistical facts that say nothing of the immense historical record of overtly racist policing, which does not lend itself so easily to quantification. If certain methods erase these stark (and undisputed) disparities, painting a picture of a social landscape in which race does not causally influence police shooting behaviors, then so much worse for those methods. From this vantage, failing to take account of the many different forms of evidence of decades of racialized policing and policymaking is not only normatively wrong. It is also empirically absurd, especially as a self-styled “evidence-based” program that seeks to illuminate the truths of our social world.

Rejecting a study’s methods on the basis of disagreement with its results is a completely legitimate inferential move.

This suggestion—that we sometimes ought to reject a finding on the grounds that it does not accord with our prior beliefs—might seem downright heretical to the project of empirical science. And indeed, there is some danger here; at the extreme, indiscriminate refusal to change our minds in the light of evidence reeks of a sham commitment to empirical study of the world. But the truth is that scientists reject findings on these sorts of grounds all the time in the course of utterly routine scientific practice. (For just one recent newsworthy example, consider a 2011 study that found evidence for extrasensory perception.) The move need not signal a failure of rationality; indeed it can often be a demand of it. Determining which it is, in any particular case, cannot be settled by asking whether one has been faithful to “facts and logic,” as so many like to say, or to the pure rigors of mathematical deduction.

Instead, when a scientific finding conflicts with one of our convictions, each of us must comb over what philosopher W. V. O. Quine so charmingly called our “web of belief,” considering what must be sacrificed so that other beliefs might be saved. And since our webs are not all identical, what rational belief revision demands of us will also vary. One man’s happily drawn conclusion (p, therefore q!) is another’s proof by contradiction (surely not q, therefore not p!). Or as the saying goes, one man’s modus ponens is another man’s modus tollens. Rejecting a study’s methods or its starting assumptions on the basis of disagreement with its results is a completely legitimate inferential move. We tend to overlook this feature of science only because for most of us, so much of the nitty-gritties of scientific inquiry have little direct bearing on our daily lives. Our webs of belief usually are not touched by the latest developments in science. But once scientific findings make contact with—and perhaps even run up against—our convictions, we become much more sensitive to the way the chain of reasoning is made to run.

The fact that good faith efforts at rationality might lead different people to different or even opposite conclusions is a basic, if unsettling, limitation of science. We cannot hope for pure knowledge of the world, deductively chased from data to conclusion without mediating theory. In the end, the Fryer study controversy has been one long object lesson in how our empirical commitments are invariably entangled with normative ones, including commitments more typically thought of as ethical or political. The choice to sacrifice empirical plausibility in one’s assumptions, in particular, is not just a “scientific” matter, in the oversimplified sense of “just the facts”: it is inevitably interwoven with our ethical and political commitments. In bringing one’s web of beliefs to bear on the debate over what constitutes proper study of effects of race in policing, one puts forth not just prior empirical beliefs about, say, the prevalence of racial targeting or the fidelity of police reporting practices, but also one’s orientation toward matters of racial justice and self-conceptualization as a researcher of race and the broader system of policing and criminal justice.

Good faith efforts at rationality might lead different people to different or even opposite conclusions.

For the APSR authors, bias in policing presents both enough of a normative concern and an empirical possibility to license, as a matter of good scientific practice, the sacrifice of certain business-as-usual approaches. The second camp, by contrast, is loath to make the leap to discard approaches held in such high esteem. Their commitment to the usefulness of the standard approaches runs so deep that they do not yet see sufficient cause for retreat. In a revision of their paper released in October, the second camp removes the explicit assertion of a “mathematical error” but finds “reason to be optimistic” that many cases of potential discrimination do meet the empirical conditions prescribed by the statistical assumptions proposed to salvage the usual approaches.

What exactly these reasons for optimism are remains unclear. By the second camp’s own admission, because “one cannot know the exact nature and impact of unmeasured confounding . . . we must rely in large part on domain expertise and intuition to form reasonable conclusions.” And yet without reference to any such further evidence or support, they nevertheless conclude: “In this case, we interpret our results as providing moderately robust evidence that perceived gender and race have limited effects on prosecutorial charging decisions in the jurisdiction we consider.” Such a claim ultimately says much more about their web of belief than about the actually existing social world.

For those whose beliefs, empirical and ethical, are forged in participation in radical sociopolitical movements from below, to be ill-inclined to accept certain findings about race and policing is to remain steadfast in a commitment to a certain thick set of empirical and ethical propositions in their webs of beliefs: that systems of policing and prisons are instruments of racial terror and that any theory of causation, theory of race, and statistical methods worth their salt will see race to be a significant causal factor affecting disparate policing and prison outcomes. This just is the first test of “fitting the data.” It is not a flight from rationality but an exercise of it.

We need a conception of social theory that is at once “empirical, interpretative, and critical.”

Does this view of social science transform an epistemic enterprise into a crudely political one? Does a readiness to sacrifice some scientific findings to save ethical or political commitments endanger the status of science as a distinctive project that seeks to produce new knowledge about the world? I think it doesn’t have to. Even the hardest-nosed empiricist starts from somewhere. She must interpret her data against some background theory that she takes to be the most natural, most plausible, and most fruitful. Deviations from this position that are self-consciously animated by politics need not be less genuinely truth-seeking than self-styled neutral deference to the status quo.

This fact tends to get lost in debates about where science sits along a continuum that runs from “objective” (protected from bias and outside interference) to “political” (a no-holds-barred struggle for power, the label of “science” slapped onto whatever the winner wishes). What that picture elides is how science unfolds in the trenches of knowledge production: in the methodological minutiae that determine which assumptions must be sacrificed and which can be saved, when abstraction leads to silly logic puzzles and when it is a necessary evil, which conclusions trigger double-takes and which signal paradigm shifts, and so on. To acknowledge that these struggles cannot take us beyond the never-ending tides of the “latest findings” is not to give up on quantitative social science as a venture for better understanding the world. It is simply to embrace a conception of social inquiry that is always, as philosopher Richard J. Bernstein put it, at once “empirical, interpretative, and critical.”

Here’s why I’m skeptical of Roland Fryer’s new, much-hyped study on police shootings

When Fryer (an economist by training) tells the Times that he got interested in police shootings because of “his anger after the deaths of Michael Brown and Freddie Gray,” and (in Fryer’s words) “decided I was going to collect a bunch of data and try to understand what really is going on,” that should be another humongous red flag.


It implies that Fryer assumed he was doing something pioneering, rather than asking first what work was already being done and what he could add to the existing conversation. This is something that often happens when people in “quantitative” social sciences, like economics, develop an interest in topics covered in other social sciences — in this case, criminology: They assume that no rigorous empirical work is being done.

[Chart: police shootings by race]

Ask yourself: How broad is this data? How broad are the claims being made about it?

The Times report does explain the data that Fryer and company used in the new study. But it turns out it’s not nearly as broad a sample as the conclusion “these results undercut the idea that the police wield lethal force with racial bias” would suggest:

(Fryer) and a group of student researchers spent about 3,000 hours assembling detailed data from police reports in Houston; Austin, Tex.; Dallas; Los Angeles; Orlando, Fla.; Jacksonville, Fla.; and four other counties in Florida.

That’s 10 police departments in three states, with a majority of them based in major cities. That allowed Fryer and his team to compile a database with a lot of different shootings — about 1,330 over 16 years, from 2000 to 2015 — but not a lot of different police departments or institutional cultures.

When it comes to policing, this is especially important, because so many issues of crime and policing are local. Different cities have different approaches to police-community relations; different tensions; different standards for use of force. (In fact, the cities Fryer and his team worked with are all members of a White House initiative on policing data launched in 2015 — and the kind of department that thinks data collection and transparency are important is likely to have different priorities in other regards than one that isn’t.)


In comparison, the FBI’s Uniform Crime Report database includes records from thousands of police departments around the country. The number of shootings in the Fryer data set, spanning 16 years, is about equivalent to what the Uniform Crime Report compiles in two or three.

More importantly, the UCR report includes not just major cities but small towns and rural areas; not just diverse cities but less diverse ones; not just departments that think carefully about data collection but departments for which it’s just a needed chore to qualify for government funds.


Ask yourself: Is the question the study answers the same one the public is asking?

The most revealing passage in the Times article is probably the one explaining what Fryer and his team didn’t include in their study:

It focused on what happens when police encounters occur, not how often they happen. Racial differences in how often police-civilian interactions occur reflect greater structural problems in society.

In other words, Fryer and company found that there weren’t big racial disparities in how often black and white suspects who’d already been stopped by police were killed. But they deliberately avoided the question of whether black citizens are more likely to be stopped to begin with (they are) and whether they’re more likely to be stopped without cause (yup).

Avoiding those issues makes sense for the question Fryer was trying to answer. He wanted to know what happens between the moment a police officer stops someone and the moment he pulls the trigger — and how those sequences of events vary by race.

But when people talk about racial disparities in police use of force, they’re usually not asking, Is a black American stopped by police treated the same as a white American in the same circumstances? They’re making a broader critique of the “greater structural problems” in society in general and the criminal justice system in particular. They’re saying that black Americans are more likely to get stopped by police, which makes them more likely to get killed.

Eric Garner was killed in 2014 when police tried to arrest him for selling loose cigarettes. Philando Castile had been pulled over 52 times on misdemeanors (including for driving without a muffler and not wearing a seatbelt) before he was shot and killed last week. Michael Brown was stopped by Darren Wilson for walking in the middle of the street.

Maybe it’s possible (maybe) that those encounters would have been just as likely to escalate to the point of lethal force if each of those men had been white — but it kind of misses the point to say that, because if they’d been white, the encounters probably never would have happened.
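To see why conditioning on the stop misses the point, here is a toy calculation. All of the numbers are invented for illustration — the stop rates and the per-stop lethality figure are assumptions, not data from the study:

```python
# Assumed stops per 100,000 residents, with black residents stopped
# three times as often -- the disparity the study deliberately set aside.
stops_per_100k = {"white": 2000, "black": 6000}

# Assume identical treatment once stopped: the same (tiny) probability
# that any given stop ends in lethal force, regardless of race.
lethal_per_stop = 0.0005

deaths_per_100k = {group: stops * lethal_per_stop
                   for group, stops in stops_per_100k.items()}
print(deaths_per_100k)
# → {'white': 1.0, 'black': 3.0}
```

In this made-up world, treatment per stop — the quantity Fryer measures — is perfectly equal, yet there is still a threefold per-capita disparity in deaths, driven entirely by who gets stopped.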

Controlling for variables is an extremely important thing in social science. It allows you to figure out which factors actually matter and which ones don’t. In this case, Fryer and his team have given us suggestive evidence that among major-city police forces, police in tense situations are not unusually likely to shoot black suspects. They’ve made a valuable addition to the literature. But it’s just that: an addition, not a discovery, and not the last word.

Roland Fryer is wrong: There is racial bias in shootings by police

2020 update: The specific flaws of Roland Fryer’s paper have now been characterized in two studies (by other scholars, not myself). Knox, Lowe, and Mummolo (2019) reanalyze Fryer’s data to find it understates racial biases. Ross, Winterhalder, and McElreath (2018) do something similar through a statistical simulation.


Roland Fryer, an economics professor at Harvard University, recently published a working paper at NBER on the topic of racial bias in police use of force and police shootings. The paper gained substantial media attention – a write-up of it became the top-viewed article on the New York Times website. The most notable part of the study was its finding that there was no evidence of racial bias in police shootings, which Fryer called “the most surprising result of [his] career”. In his analysis of shootings in Houston, Texas, black and Hispanic people were no more likely (and perhaps even less likely) to be shot relative to whites.

Fryer’s analysis is highly flawed, however. It suffers from major theoretical and methodological errors, and he has communicated the results to news media in a way that is misleading. While there have long been problems with the quality of police shootings data, there is still plenty of evidence to support a pattern of systematic, racially discriminatory use of force against black people in the United States.

Breaking down the analysis of police shootings in Houston

There should be no argument that black and Latino people in Houston are much more likely to be shot by police compared to whites. I looked at the same Houston police shooting dataset as Fryer for the years 2005-2015, which I supplemented with census data, and found that black people were over 5 times as likely to be shot relative to whites. Latinos were roughly twice as likely to be shot versus whites.

Fryer was not comparing rates of police shootings by race, however. Instead, his research asked whether these racial differences were the result of “racial bias” rather than merely “statistical discrimination”. Both terms have specific meanings in economics. Statistical discrimination occurs when an individual or institution treats people differently based on racial stereotypes that ‘truly’ reflect the average behavior of a racial group. For instance, if a city’s black drivers are 50% more likely to possess drugs than white drivers, and police officers are 50% more likely to pull over black drivers, economic theory would hold that this discriminatory policing is rational. If, however, police were to pull over black drivers at a rate that disproportionately exceeded their likelihood of drug possession, that would be an irrational behavior representing individual or institutional bias.
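The distinction can be made concrete with a minimal sketch. The ratios below encode only the definitional test described above, using the paragraph's hypothetical 50% figures — nothing here comes from Fryer's data:

```python
def classify_stops(stop_ratio: float, possession_ratio: float) -> str:
    """Economics framing sketched above: stopping a group in proportion
    to its average contraband rate counts as 'statistical discrimination';
    stopping it beyond that proportion is read as 'racial bias'."""
    if stop_ratio > possession_ratio:
        return "racial bias"
    return "statistical discrimination"

# Black drivers 50% more likely to possess drugs, police 50% more likely
# to stop them: the ratios match, so the framework deems it "rational."
print(classify_stops(stop_ratio=1.5, possession_ratio=1.5))
# → statistical discrimination

# Stops that disproportionately exceed the possession gap:
print(classify_stops(stop_ratio=2.5, possession_ratio=1.5))
# → racial bias
```

Note that this captures only the definitional distinction, not an empirical test; the outcome tests actually used in the literature (comparing hit rates per stop) rest on further assumptions of their own.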

Once explained, it is possible to find the idea of “statistical discrimination” just as abhorrent as “racial bias”. One could point out that the drug laws police enforce were passed with racially discriminatory intent, that collectively punishing black people based on “average behavior” is wrong, or that – as a self-fulfilling prophecy – bias can turn into statistical discrimination (if black people’s cars are searched more thoroughly, for instance, it will appear that their rates of drug possession are higher). At the same time, studies assessing the extent of racial bias above and beyond statistical discrimination have been able to secure legal victories for civil rights. An analysis of stop-and-frisk data by Jeffrey Fagan, which found evidence of racial bias, was an important part of the court case against the NYPD and helped secure an injunction against the policy.

Even if one accepts the logic of statistical discrimination versus racial bias, it is an inappropriate choice for a study of police shootings. The method that Fryer employs has, for the most part, been used to study traffic stops and stop-and-frisk practices. In those cases, economic theory holds that police want to maximize the number of arrests for the possession of contraband (such as drugs or weapons) while expending the fewest resources. If they are acting in the most cost-efficient, rational manner, the officers may use racial stereotypes to increase the arrest rate per stop. This theory completely falls apart for police shootings, however, because officers are not trying to rationally maximize the number of shootings. The theory that is supposed to be informing Fryer’s choice of methods is therefore not applicable to this case. He seems somewhat aware of this issue. In his interview with the New York Times, he attributes his ‘surprising’ finding to an issue of “costs, legal and psychological” that happen following a shooting. In what is perhaps a case of cognitive dissonance, he seems to not have reflected on whether the question of cost renders his choice of methods invalid.

Economic theory aside, there is an even more fundamental problem with the Houston police shooting analysis. In a typical study, a researcher will start with a previously defined population where each individual is at risk of a particular outcome. For instance, a population of drivers stopped by police can have one of two outcomes: they can be arrested, or they can be sent on their way. Instead of following this standard approach, Fryer constructs a fictitious population of people who are shot by police and people who are arrested. The problem here is that these two groups (those shot and those arrested) are, in all likelihood, systematically different from one another in ways that cannot be controlled for statistically (UPenn Professor Uri Simonsohn expands on this point here). Fryer acknowledges this limitation in a brief footnote, but understates just how problematic it is. Properly interpreted, the actual result from Fryer’s analysis is that the racial disparity in arrest rates is larger than the racial disparity in police shootings. This is an unsurprising finding, and proves neither a lack of bias nor a lack of systematic discrimination.
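The benchmarking problem can be illustrated with invented numbers. Suppose — purely hypothetically, these counts are not from any dataset — that arrests are racially skewed more steeply than shootings:

```python
# Hypothetical counts per 100,000 residents of each group (not Fryer's data).
white = {"arrests": 1000, "shootings": 10}
black = {"arrests": 5000, "shootings": 30}

# Per capita, black residents in this toy world are shot 3x as often.
per_capita_ratio = black["shootings"] / white["shootings"]

# Benchmarked against arrests -- the constructed population Fryer uses --
# black residents look *less* likely to be shot, because the denominator
# is itself inflated by disparate arrest practices.
shot_per_arrest = {"white": white["shootings"] / white["arrests"],  # 0.010
                   "black": black["shootings"] / black["arrests"]}  # 0.006
print(per_capita_ratio, shot_per_arrest)
```

“No anti-black disparity per arrest” and “large anti-black disparity per capita” thus describe the same hypothetical world; which one the analysis can see depends entirely on the choice of denominator.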

Even if the difference in the arrest vs. shooting groups could be accounted for, Fryer tries to control for these differences using variables in police reports, such as if the suspect was described as ‘violently resisting arrest’. There is reason to believe that these police reports themselves are racially biased. An investigation of people charged with assaulting a police officer in Washington, DC found that this charge was applied disproportionately towards black residents even for situations in which no assault actually occurred. This was partly due to an overly broad definition of assault against police in DC law, but the principle – that police are likely to describe black civilians as more threatening – is applicable to other jurisdictions.

I’ll also briefly note that there was another analysis, using data from multiple cities, that looked at racial differences in whether or not civilians attacked officers before they were shot. Fryer himself downplays the credibility of this analysis, because it relied on reports from police who had every incentive to misrepresent the order of events.

Racial inequality in police shootings

Fryer’s study is far from the first to investigate racial bias or discrimination in police shootings. A number of studies have placed officers in shooting simulators, and most have shown a greater propensity for shooting black civilians relative to whites. Other research has found that cities with black mayors and city councilors have lower rates of police shootings than would otherwise be expected. A recent analysis of national data showed wide variation in racial disparities for police shooting rates between counties, and these differences were not associated with racial differences in crime rates. This is just a small sample of the dozens of studies on police killings published since the 1950s, most of which suggests that racial bias is indeed a problem.

It is a failure of journalism that the New York Times heavily promoted this study without seeking critical perspectives from experts in the field. Fryer makes basic methodological errors, overstates the quality of his results, and casually uses the term “racial bias” in a way that is nearly guaranteed to be misinterpreted by anyone who isn’t an economist.