“Shock and Awe” data storytelling
Smart use of comparison, conversion, and contextualization is essential to effective data storytelling when sharing research findings. Instead of listing results like a lab report or a phone book of numbers, applying the principles of shock and awe data storytelling helps our data come to life. Doing so helps us explain findings more effectively to diverse audiences, helps those audiences understand the scale and scope of our findings, and highlights actionable imperatives and implications.
1. Contextualize large numbers by comparing to known places.
When reporting aggregate counts of people, events, engagements, or likes, totals often run into the millions. To contextualize them, reference known places such as cities or states with populations in the millions. When counts are smaller than the scale of metropolitan areas or states, it can be useful to reference well-known public spaces such as museums, performing arts arenas, or sports stadiums. Pick places most relevant to your audience; for example, don’t compare engagements with an online art exhibit to Lincoln Center for a general public audience, but perhaps do so if the client is in the New York metropolitan area or in the arts. Done well, this helps audiences understand the magnitude of findings and appropriately frame the size of the problem or opportunity.
To contextualize the number of Blacks arrested each year (2,115,381), I compare that total to the populations of major US cities. This benchmark reveals that the people arrested would together form the fifth-largest city in the United States, just behind Houston and larger than Philadelphia, Phoenix, and San Antonio.
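The place comparison above can be sketched in a few lines of Python. This is a minimal illustration, not the method used for the original analysis: the city populations are rounded, approximate 2018 figures supplied here as assumptions, and the function names are hypothetical.

```python
# Approximate 2018 city populations in millions.
# ASSUMPTION: rounded illustrative figures, not taken from the article.
CITY_POPULATIONS = {
    "New York": 8.40,
    "Los Angeles": 3.99,
    "Chicago": 2.71,
    "Houston": 2.33,
    "Phoenix": 1.66,
    "Philadelphia": 1.58,
    "San Antonio": 1.53,
}


def rank_among_places(count, places):
    """Return the 1-based rank `count` would hold if it were one of `places`."""
    larger = [name for name, pop in places.items() if pop > count]
    return len(larger) + 1


def nearest_neighbors(count, places):
    """Return (next larger place, next smaller place); None where none exists."""
    bigger = [(pop, name) for name, pop in places.items() if pop > count]
    smaller = [(pop, name) for name, pop in places.items() if pop <= count]
    above = min(bigger)[1] if bigger else None
    below = max(smaller)[1] if smaller else None
    return above, below


# Annual arrest count from the text, expressed in millions.
arrests_millions = 2_115_381 / 1_000_000
rank = rank_among_places(arrests_millions, CITY_POPULATIONS)
above, below = nearest_neighbors(arrests_millions, CITY_POPULATIONS)
print(f"Rank {rank}: just behind {above}, larger than {below}")
```

Swapping in a different lookup table (states, stadium capacities, museum attendance) is how the same sketch would adapt to the audience in question.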
2. Contextualize large numbers by comparing to known events.
When reporting data trends over time, we often compare current-period findings to findings from previous periods. We can also benchmark findings against known reference events, either in the same domain (such as a previous quarter of historic growth) or from well-known history. When doing the latter, we can emphasize the social implications of our findings to compel action, especially if the reference point is a catastrophic event.
To contextualize the number of Blacks reported killed in 2018 (7,407), I compare it to the well-known reference point of 9/11. This comparison shows that each year, more than twice as many Black Americans are murdered as died in the September 11th terrorist attacks.
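The arithmetic behind an event comparison is a single ratio. The sketch below assumes the commonly cited count of 2,977 victims for the September 11th attacks, a figure that does not appear in the text above; the function name is hypothetical.

```python
# ASSUMPTION: commonly cited victim count for the September 11th attacks;
# this number is supplied here and is not given in the article.
SEPT_11_FATALITIES = 2_977


def multiple_of_reference(count, reference):
    """Express `count` as a multiple of a reference event's count."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return count / reference


# Homicide count from the text.
ratio = multiple_of_reference(7_407, SEPT_11_FATALITIES)
print(f"{ratio:.1f} times the reference event")
```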
3. Convert counts into frequency of occurrence.
Another way to contextualize large counts is to convert them to a rate of occurrence so audiences understand how frequently an event or action happens. By converting sums into a common frame of reference, time, we show the pervasiveness of findings and add urgency to the imperative to act. This is also useful in framing key findings, since the frequency of events is often more memorable than the total count of events.
To contextualize the number of Blacks arrested each year (2,115,381) and murdered (7,407), I calculate how often these events happen. Dividing annual counts by the days and seconds in a year, I find 5,796 Blacks are arrested and 20 murdered daily, or one arrest every 15 seconds and one murder every 71 minutes.
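The conversion above is simple division, and a short sketch makes the steps explicit. It assumes a 365-day year; the function names are illustrative rather than anything used in the original analysis.

```python
DAYS_PER_YEAR = 365  # ASSUMPTION: non-leap year
SECONDS_PER_YEAR = DAYS_PER_YEAR * 24 * 60 * 60


def per_day(annual_count):
    """Average number of events per day."""
    return annual_count / DAYS_PER_YEAR


def seconds_between(annual_count):
    """Average number of seconds between consecutive events."""
    return SECONDS_PER_YEAR / annual_count


arrests, murders = 2_115_381, 7_407  # annual counts from the text

print(f"{per_day(arrests):,.0f} arrests per day")
print(f"{per_day(murders):.0f} murders per day")
print(f"one arrest every {seconds_between(arrests):.0f} seconds")
print(f"one murder every {seconds_between(murders) / 60:.0f} minutes")
```

Choosing the time unit (seconds, minutes, days) so that the resulting number is small and easy to say aloud is usually what makes the frequency memorable.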
4. Compare likelihood of occurrence across key audiences.
Often, the best reference point for comparison is another audience or subset within the data we have. This is most useful when there is a pair of clients, competitors, or demographic groups that already form natural comparison groups. This type of comparison is especially useful for highlighting gaps, disparities, and inequalities within the data. It is also effective in highlighting the key differentiating metric, likelihood of occurrence, since comparing total counts can be uninformative. With the right denominator, meaning the universe of people or events to which the outcome could have happened, calculating and comparing likelihoods enables comparisons that account for potential confounders or other differences between comparison groups.
When comparing treatment of Black and White Americans by police, looking at just total counts of people arrested provides incomplete insight. It is a numerical fact that more White Americans are arrested each year than Black Americans (5.3 million versus 2.1 million). However, that is not evidence of equal treatment, since there are significantly more White Americans (67% of the US population) than Black Americans (13%). When we compare likelihoods of being arrested given population, the data show that Black Americans are over two times (2.3x) more likely to be arrested than White Americans. Similarly, while more White Americans are killed by police than Black Americans (454 versus 229 in 2018), Black Americans are nearly three times (2.6x) more likely to be killed by police than White Americans when factoring in the number of police-initiated contacts.
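A relative likelihood is a ratio of two rates, each a count divided by its own denominator. The sketch below uses only the rounded figures quoted above (arrests in millions, population as a share of the US total), so it yields roughly 2.0 rather than the 2.3x reported; the reported figure presumably rests on unrounded counts and population denominators that are not given here. Function names are hypothetical.

```python
def rate(events, population):
    """Events per unit of population (the 'likelihood of occurrence')."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population


def relative_risk(events_a, pop_a, events_b, pop_b):
    """How many times more likely the event is in group A than in group B."""
    return rate(events_a, pop_a) / rate(events_b, pop_b)


# Rounded inputs quoted in the text: arrests in millions, population shares.
# Because both groups share the same total US population, shares can stand
# in for head counts without changing the ratio.
rr = relative_risk(events_a=2.1, pop_a=0.13, events_b=5.3, pop_b=0.67)
print(f"Relative risk from rounded inputs: {rr:.2f}x")
```

Changing the denominator (for example, police-initiated contacts instead of population) changes the question being answered, which is why the choice of denominator deserves an explicit statement alongside the result.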
5. Convert risk into economic costs.
Another way to compare data is to show the economic costs of outcomes or behaviors. Converting the rate of events into dollar costs highlights real-world effects and provides a common metric for comparison. This helps not only to generate mediagenic headlines, but also to factor in other dimensions of difference when comparing relative likelihoods across comparison groups.
Economic cost analysis further highlights the disparities in criminal (in)justice between Blacks and Whites. Through extrapolation of expected earnings based on the Value of Statistical Life estimate, median income by race, and population size, I find that the economic cost of murdered Blacks, $741 billion, is equivalent to 41% of annual Black income. And when factoring for disparities in wealth between Blacks and Whites, I find that the economic cost of homicide on Black communities is more than 11 times (11.4x) greater than in White communities.
6. Create a counterfactual.
Another way to illustrate relative risk and contextualize findings is to calculate a counterfactual. A counterfactual is a data-driven model of what could be, built to help us understand what is. It gives an empirical estimate of a “what if” scenario: for example, what would happen if the likelihood of occurrence were flipped between two comparison groups, or if a campaign increased engagement by a certain percentage. Because it extrapolates from currently observed and measured outcomes, the counterfactual helps us further highlight empirical gaps or potential.
In the context of racial disparities in incarceration rates, I calculated the expected numbers of Whites and Blacks who would be in prison if their relative risks of incarceration were switched. I find that 294,010 fewer Blacks would be incarcerated if sentenced at the rate of Whites. Conversely, an additional 739,362 Whites would be incarcerated if sentenced at the rate of Blacks.
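One simple way to build such a counterfactual is to apply one group's observed rate to the other group's population and compare the result with the actual count. The sketch below shows only that mechanic. The populations and counts are hypothetical toy numbers chosen for easy arithmetic, not the data behind the figures above, and the function names are illustrative.

```python
def counterfactual_count(population, other_group_rate):
    """Expected count if this population experienced the other group's rate."""
    return population * other_group_rate


def counterfactual_gap(actual_count, population, other_group_rate):
    """Counterfactual minus actual; negative means fewer under the swap."""
    return counterfactual_count(population, other_group_rate) - actual_count


# HYPOTHETICAL toy inputs, not real incarceration data.
group_a = {"population": 1_000_000, "incarcerated": 10_000}  # rate of 1.0%
group_b = {"population": 4_000_000, "incarcerated": 8_000}   # rate of 0.2%

rate_a = group_a["incarcerated"] / group_a["population"]
rate_b = group_b["incarcerated"] / group_b["population"]

gap_a = counterfactual_gap(group_a["incarcerated"], group_a["population"], rate_b)
gap_b = counterfactual_gap(group_b["incarcerated"], group_b["population"], rate_a)

print(f"Group A at group B's rate: {gap_a:+,.0f}")
print(f"Group B at group A's rate: {gap_b:+,.0f}")
```

The two gaps differ in size because the groups differ in population, which is why a swapped-rate counterfactual is usually reported in both directions.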
7. Combine multiple comparisons, conversions, and contextualization.
Finally, combine multiple shock and awe principles when presenting data. Doing so lets us efficiently show the current status, reveal gaps between key comparison groups, and contextualize the magnitude of differences to highlight actionable imperatives. Mastering which combinations to use, and when, will make our storytelling more effective and powerful.
One example of this is how I describe racial disparities in arrest after initial police contact. I create a counterfactual based on relative risk calculations and contextualize it by reference to known places. I conclude that if Black Americans were arrested at the White rate, over 1.1 million fewer Black Americans would be arrested each year, enough people to fill all 29 NBA stadiums twice over. Conversely, if White Americans were arrested at the Black rate, over 5.7 million more White Americans would be arrested, which is equivalent to the population of Colorado.
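Turning a counterfactual count into a "how many venues" statement is one more division. The sketch below assumes an average NBA arena capacity of about 19,000 seats, a round figure supplied here for illustration and not stated in the text; the function name is hypothetical.

```python
# ASSUMPTION: rough average seating capacity of an NBA arena,
# supplied for illustration and not taken from the article.
ASSUMED_AVG_ARENA_CAPACITY = 19_000


def times_filled(count, venue_capacity, n_venues=1):
    """How many times `count` people would fill `n_venues` venues."""
    if venue_capacity <= 0 or n_venues <= 0:
        raise ValueError("capacity and number of venues must be positive")
    return count / (venue_capacity * n_venues)


# Counterfactual difference from the text (stated as "over 1.1 million").
fewer_arrests = 1_100_000
fills = times_filled(fewer_arrests, ASSUMED_AVG_ARENA_CAPACITY, n_venues=29)
print(f"Enough to fill all 29 arenas about {fills:.1f} times")
```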