
CDC Injury Mortality: The Federal Record of How Americans Die from Firearms, Overdoses, and Crashes

12 min read · AI Analytics
CDC · NCHS · Injury Mortality · Public Health · Federal Data

Every injury death in the United States begins as a death certificate: filed in a county, registered by a state, and forwarded to the federal government, where a nosologist or an automated coding system reads the cause of death and assigns it an ICD-10 code. The CDC's National Center for Health Statistics gathers all of them into the National Vital Statistics System, and from that record it carves out the subset that comes from external causes: the firearms, the overdoses, the crashes, the falls, the suffocations. The result is the country's definitive ledger of how Americans die from injury: roughly 74,000 mortality-rate records, each fixing a mechanism, an intent, a population, and a place against a rate per 100,000.

This article covers what the injury-mortality dataset is and how the National Vital Statistics System and ICD-10 frame it; the two axes—mechanism and intent—along which every injury death is classified, and how that framework underpins the CDC's WISQARS query system; age adjustment, and why it is the only honest way to compare rates across populations and over time; the defining American injury trends the data documents—the drug-overdose epidemic from opioids to illicitly manufactured fentanyl, firearm deaths and the fact that most are suicides, rising suicide rates, and motor-vehicle deaths; the demographic and geographic strata—age, sex, race and ethnicity, and state—by which the rates are reported; how the dataset is published through CDC WONDER and the NCHS datasets on data.cdc.gov; a Python workflow that pulls injury-mortality rows, trends an age-adjusted rate over time, and ranks states by a chosen mechanism; and the caveats—small-count suppression, provisional recent years, intent misclassification, and the limits of the death certificate—that every analyst must internalize before drawing conclusions.

What the dataset is

The National Vital Statistics System (NVSS) is among the oldest and most complete federal data systems: a continuous, near-census record of every birth and every death in the United States, assembled by the National Center for Health Statistics from the vital records that the states and territories are legally responsible for registering. The death side of NVSS rests on the death certificate, a standardized document completed for every death, on which a physician, medical examiner, or coroner records the cause of death. NCHS processes those certificates into a coded mortality file in which each death carries an underlying cause of death, the single condition or event that initiated the chain leading to death. Injury mortality is the slice of that file whose underlying cause is an external cause: not a disease, but an injury or poisoning event. Surfaced through CDC data services, the injury-mortality rate record we store comprises roughly 74,000 rows.

In our database this record is stored as the table cdc_injury_mortality, with the grain of one row per mechanism by intent by demographic stratum by geography by year. A single mechanism—firearms, say—split across five intents, several demographic strata, and fifty-one geographies over a decade of years generates a large family of rows, each carrying its own death count, its own population denominator, and its own crude and age-adjusted rates. The columns capture what kind of injury, with what intent, to whom, where, when, and at what rate:

year                  -- calendar year of death (final or provisional)
mechanism             -- firearm, drug poisoning, motor vehicle, fall,
                         suffocation, drowning, fire/burn, cut/pierce, ...
intent                -- unintentional, suicide, homicide,
                         undetermined, legal intervention/war
geography             -- United States, or a state / jurisdiction
age_group             -- demographic stratum: age band (or "all ages")
sex                   -- male, female, or all
race_ethnicity        -- race and Hispanic-origin stratum (or "all")
deaths                -- count of deaths in the cell (suppressed if small)
population            -- denominator: population at risk for the cell
crude_rate            -- deaths per 100,000 population, unadjusted
age_adjusted_rate     -- rate standardized to the US standard population
rate_flag             -- suppression / unreliable / provisional indicator

The mechanism and intent columns are the load-bearing pair, and the section that follows is devoted to them. The age_adjusted_rate is the column most analyses actually use because, as the age-adjustment section explains, it is the only one that can be compared honestly across a young state and an old state, or across an aging decade. The deaths and population columns are the raw materials from which the rates are computed, and the rate_flag is the column an analyst ignores at their peril: it marks the cells where the death count is too small to publish, or too small to yield a statistically reliable rate, or where the year is still provisional. The geography and demographic columns (geography, age_group, sex, and race_ethnicity) are the strata along which the same mechanism-and-intent combination is sliced, and they are what turn a single national number into a map of who is dying, where, and at what age.

The National Vital Statistics System and ICD-10

Understanding the injury-mortality data requires understanding how a death becomes a coded record. When a person dies, a death certificate is completed by the certifier—a treating physician for most natural deaths, and a medical examiner or coroner for deaths from injury, poisoning, violence, or other external causes, which by law fall to the medicolegal death investigation system. The certifier records the immediate cause, the sequence of conditions leading to it, and, critically for injury deaths, a description of how the injury occurred. The state registers the certificate and transmits the data to NCHS, which assigns the codes and compiles the national file. Because registration is essentially universal, NVSS is a near-complete census of deaths rather than a sample—its great strength relative to survey-based health data.

The coding system is the International Classification of Diseases, Tenth Revision (ICD-10), the World Health Organization standard the United States has used for cause-of-death coding since 1999. ICD-10 is what makes the data comparable across the country and, broadly, across nations. For injury deaths the relevant codes are the external cause codes—the V, W, X, and Y blocks that describe transport accidents, falls, exposure to forces, accidental poisoning, intentional self-harm, assault, and events of undetermined intent. These external cause codes are dual in nature: they encode both how the injury happened (the mechanism) and why, in the sense of intent. A death coded to assault by discharge of a firearm and a death coded to intentional self-harm by discharge of a firearm share the same mechanism but carry opposite intents—and that duality is exactly what the mechanism-by-intent framework formalizes.
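The dual nature of the external cause codes can be sketched as a lookup from an ICD-10 category to a (mechanism, intent) pair. The snippet below is a minimal illustration covering only firearm and drug-poisoning codes; the function name, the table layout, and the label strings are assumptions made for this sketch, not the NCHS external-cause matrix itself, and subcategory-level codes such as legal-intervention firearm deaths are deliberately left out.

```python
import re

# (first_category, last_category, mechanism, intent).
# Partial table for two mechanisms only; labels are illustrative.
RANGES = [
    ("W32", "W34", "firearm", "unintentional"),
    ("X72", "X74", "firearm", "suicide"),
    ("X93", "X95", "firearm", "homicide"),
    ("Y22", "Y24", "firearm", "undetermined"),
    ("X40", "X44", "drug poisoning", "unintentional"),
    ("X60", "X64", "drug poisoning", "suicide"),
    ("X85", "X85", "drug poisoning", "homicide"),
    ("Y10", "Y14", "drug poisoning", "undetermined"),
]


def classify(code):
    """Return (mechanism, intent) for an ICD-10 code, or None if not covered."""
    m = re.fullmatch(r"([A-Z]\d{2})(?:\.?\d)?", code.strip().upper())
    if not m:
        return None
    category = m.group(1)  # three-character category, e.g. "X74"
    for first, last, mechanism, intent in RANGES:
        # Each range shares one letter followed by two digits, so plain
        # string comparison orders the categories correctly.
        if first <= category <= last:
            return mechanism, intent
    return None


print(classify("X74"))  # firearm death, intentional self-harm
print(classify("X93"))  # firearm death, assault
```

Both calls return the same mechanism with different intents, which is the duality the paragraph above describes.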

The 1999 switch from the ninth revision (ICD-9) to ICD-10 introduced a discontinuity that still matters for long-run analysis. The two revisions partition causes of death differently, so a rate trend that crosses 1999 is not a clean apples-to-apples series; NCHS publishes comparability ratios to help bridge the break, but any decades-spanning trend must account for it. Within the ICD-10 era the coding has been more stable, though periodic refinements and changes in certification practice still introduce smaller seams—a reminder that the data is not a neutral recording of nature but the product of a coding system that evolves.
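As a rough sketch of how a comparability ratio is applied, the snippet below scales the ICD-9-era points of a series so that they sit on the ICD-10 footing. The function name, the series values, and the ratio of 1.04 are all hypothetical placeholders; a real analysis would take the cause-specific ratio that NCHS publishes for the cause being trended.

```python
def bridge_icd9(rates_by_year, ratio, first_icd10_year=1999):
    """Multiply pre-ICD-10 rates by a comparability ratio; leave later years."""
    return {
        year: rate * ratio if year < first_icd10_year else rate
        for year, rate in rates_by_year.items()
    }


# Hypothetical rates per 100,000 and a hypothetical ratio, for illustration.
raw = {1997: 10.0, 1998: 10.0, 1999: 10.4, 2000: 10.5}
bridged = bridge_icd9(raw, ratio=1.04)
for year in sorted(bridged):
    print(year, round(bridged[year], 2))
```

In this invented series the apparent jump between 1998 and 1999 disappears once the earlier years are scaled, which is the kind of artifact the ratio is meant to remove.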

The mechanism-by-intent framework

Injury mortality is organized along two axes, and holding them distinct is the single most important conceptual move in working with the data. The first axis is mechanism—the physical means by which the fatal injury occurred. The standard mechanism categories include firearm, poisoning and drug overdose (the category that captures the overdose epidemic), motor vehicle and broader transport, falls, suffocation (which includes hanging and strangulation), drowning, and fire and burns, alongside cut/pierce, struck by/against, machinery, and other categories. Mechanism answers the question “what killed them?”

The second axis is intent—the manner of death, the answer to “was it an accident, a suicide, a homicide?” The standard intent categories are unintentional (accidental), suicide (intentional self-harm), homicide (assault), undetermined (where the investigation could not establish whether the death was accidental or intentional), and legal intervention and operations of war (deaths caused by law-enforcement action or military operations). Every injury death sits at the intersection of one mechanism and one intent—a firearm suicide, an unintentional drug poisoning, a homicide by firearm, an unintentional fall—and the dataset is built as a cross-tabulation of the two.

This is precisely the framework behind WISQARS, the CDC's Web-based Injury Statistics Query and Reporting System, the agency's public interface for injury data. WISQARS lets a user request fatal-injury counts and rates by any combination of mechanism, intent, demographic stratum, and geography—and the table we store is the same mechanism-by-intent-by-stratum cross-tabulation that WISQARS serves, drawn from the same NVSS underlying-cause-of-death file. The reason the two-axis structure matters so much analytically is that mechanism and intent must never be conflated. “Firearm deaths” is a mechanism total that silently combines suicides, homicides, and accidents; collapsing it into a single number obscures that, nationally, the majority of firearm deaths are suicides rather than homicides—a fact with very different policy implications than the homicide-only framing the phrase often connotes. Likewise, “overdose deaths” spans unintentional poisonings, suicides, and undetermined cases. The framework exists so that an analyst is forced to specify both axes and is never misled by a mechanism total that hides its intent composition.
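A small cross-tabulation makes the point concrete. The counts below are invented for illustration and are not CDC figures; the helper names are likewise assumptions of this sketch. The code builds the mechanism-by-intent table and then shows the intent composition that a bare mechanism total would hide.

```python
from collections import defaultdict

# Toy death counts, invented for illustration only.
rows = [
    {"mechanism": "firearm", "intent": "suicide", "deaths": 55},
    {"mechanism": "firearm", "intent": "homicide", "deaths": 40},
    {"mechanism": "firearm", "intent": "unintentional", "deaths": 5},
    {"mechanism": "drug poisoning", "intent": "unintentional", "deaths": 90},
    {"mechanism": "drug poisoning", "intent": "suicide", "deaths": 6},
    {"mechanism": "drug poisoning", "intent": "undetermined", "deaths": 4},
]


def cross_tab(rows):
    """Sum deaths into a nested mapping: table[mechanism][intent]."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r["mechanism"]][r["intent"]] += r["deaths"]
    return table


def intent_shares(table, mechanism):
    """Share of a mechanism's deaths that falls under each intent."""
    cells = table[mechanism]
    total = sum(cells.values())
    return {intent: n / total for intent, n in cells.items()} if total else {}


table = cross_tab(rows)
firearm = intent_shares(table, "firearm")
for intent, share in sorted(firearm.items(), key=lambda kv: -kv[1]):
    print(f"{intent:<14} {share:.0%}")
```

Reporting the firearm row as a single total of 100 would be accurate and still uninformative; the shares are what carry the policy-relevant distinction.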

Age adjustment and rates

A raw count of deaths tells you how many people died; it tells you almost nothing about risk until it is turned into a rate, and even a rate can mislead until it is age-adjusted. This is the second indispensable concept in the data, and the dataset carries both forms of rate for exactly this reason.

The crude rate is the simple one: deaths in a population divided by that population, expressed per 100,000 people. It answers “how many per hundred thousand actually died this year in this place?” The problem is that the risk of dying from most injuries depends heavily on age—falls kill the old, overdoses concentrate in midlife adults, motor-vehicle crashes weigh on the young—so the crude rate of any place is partly an artifact of its age structure. A state with a large retirement population will post a high crude fall-death rate not because its environment is more dangerous but because it is older. Comparing crude rates across a young state and an old state, or across an aging nation over time, conflates the thing you care about (the risk at a given age) with the thing you do not (the age mix of the population).

The age-adjusted rate solves this. It computes the death rate within each age group and then re-weights those age-specific rates to a fixed standard population—in US practice, the year 2000 standard population—so that every place and every year is expressed as if it had the same age distribution. What remains after the age structure is held constant is the part of the difference that reflects real differences in risk. Age-adjusted rates are therefore the right tool for almost every comparison the data invites: ranking states, charting a national trend across an aging population, or comparing demographic groups with different age profiles. The discipline is simple but absolute: compare crude rates only within a single age-homogeneous slice, and use age-adjusted rates for everything else. An analysis that ranks states on crude injury rates is, to a substantial degree, ranking them on how old their residents are.
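The difference between the two rates can be sketched with direct standardization on a toy population. Everything numeric below is invented: the three age bands and their weights stand in for the real year 2000 standard population, which uses more bands and different weights, and the two "states" are hypothetical. Both states are given identical age-specific risks and different age mixes.

```python
# Illustrative standard weights (sum to 1); NOT the real US 2000 standard.
STD_WEIGHTS = {"0-24": 0.4, "25-64": 0.4, "65+": 0.2}


def crude_rate(deaths, population):
    """Total deaths over total population, per 100,000."""
    return sum(deaths.values()) / sum(population.values()) * 100_000


def age_adjusted_rate(deaths, population, weights=STD_WEIGHTS):
    """Weight each age-specific rate by the standard population share."""
    return sum(
        weights[age] * deaths[age] / population[age] * 100_000
        for age in weights
    )


# Same age-specific rates (5, 10, 50 per 100,000) in both places;
# only the age structure differs.
YOUNG_STATE = {
    "population": {"0-24": 600_000, "25-64": 300_000, "65+": 100_000},
    "deaths": {"0-24": 30, "25-64": 30, "65+": 50},
}
OLD_STATE = {
    "population": {"0-24": 300_000, "25-64": 300_000, "65+": 400_000},
    "deaths": {"0-24": 15, "25-64": 30, "65+": 200},
}

for name, s in (("young", YOUNG_STATE), ("old", OLD_STATE)):
    crude = crude_rate(s["deaths"], s["population"])
    adjusted = age_adjusted_rate(s["deaths"], s["population"])
    print(f"{name:<6} crude={crude:5.1f}  age-adjusted={adjusted:5.1f}")
```

The crude rates differ by more than a factor of two while the age-adjusted rates coincide, because the only thing that differs between the two toy states is how old their residents are.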

The defining trends: overdose, firearms, suicide, and crashes

The injury-mortality data is the federal evidentiary record behind the major public-health stories of the last quarter century. Four mechanism-and-intent trends dominate it.

The drug-overdose epidemic is the most consequential. Recorded under the poisoning and drug-overdose mechanism—overwhelmingly with unintentional intent—the age-adjusted overdose death rate climbed through the 2000s and 2010s in successive, identifiable waves: first a wave driven by prescription opioids, then a wave driven by heroin as supply shifted, and then an accelerating wave driven by illicitly manufactured fentanyl and other synthetic opioids, frequently in combination with stimulants such as methamphetamine and cocaine. The synthetic-opioid wave pushed annual overdose deaths to levels without precedent in the country's vital records, making drug poisoning one of the leading causes of death for working-age adults. The data's mechanism-and-intent detail, joined to the multiple-cause-of-death fields that name the specific drugs involved, is what lets researchers separate these waves and attribute the rise to particular substances.

Firearm deaths are the trend most often misunderstood, and the data corrects the misunderstanding directly. Firearms are a mechanism that spans intents, and—a point worth stating plainly because it is so frequently lost—the majority of US firearm deaths are suicides, not homicides. Firearm suicide and firearm homicide are distinct phenomena with different demographics, different geographies, and different drivers, and the mechanism-by-intent structure is what keeps them distinct. Both components rose over recent years, and firearm injury became the leading cause of death among US children and adolescents, a milestone the data documents through the age-stratified rows. Any firearm-mortality analysis that does not split by intent is, almost by definition, answering the wrong question.

Suicide, as an intent that cuts across mechanisms (firearm, suffocation, and poisoning chief among them), rose substantially over the first two decades of the ICD-10 era, with the age-adjusted suicide rate climbing to among its highest levels in modern records before showing more recent year-to-year variation. The data's value here is in the disaggregation: suicide rates differ enormously by age, sex (with much higher rates among males), race and ethnicity, and geography, and the strata are what reveal the populations at greatest risk.

Finally, motor-vehicle deaths, recorded under the transport mechanism and predominantly unintentional, tell a longer-arc story: a decades-long decline in the age-adjusted rate driven by safer vehicles, seat belts, airbags, and graduated licensing, interrupted by a notable rise in the early 2020s tied to changes in driving behavior. Together these four trends are why the injury-mortality file is among the most analyzed datasets the federal government publishes.

Demographic and geographic strata

A national rate is a starting point, never an endpoint. The analytic power of the injury-mortality data lies in its stratification: the same mechanism-and-intent combination is reported across demographic and geographic cells, and it is the strata that turn a single number into an account of disparity.

The demographic axes are age, sex, and race and ethnicity. Age is the most fundamental, because the age pattern of an injury is often its signature—the bimodal age curve of firearm deaths, the old-age concentration of fatal falls, the midlife peak of overdose mortality, the young-adult skew of motor-vehicle and homicide deaths. Sex divides nearly every injury cause sharply: men die from injury at far higher rates than women across almost all mechanisms, with suicide and homicide especially male-skewed. Race and ethnicity exposes some of the starkest disparities in all of US health data—firearm homicide concentrated among young Black men, the historically high and recently rising injury and overdose mortality among American Indian and Alaska Native populations, and shifting overdose disparities as the epidemic's demographics changed. Reporting rates by these strata is how the data moves from describing a national average to identifying who bears the burden.

The geographic axis is principally the state, with the national total alongside it; finer geographies (county, urbanization category) are available in the underlying NVSS files but are more heavily suppressed because the counts thin out. State-level age-adjusted rates reveal pronounced regional patterns—higher firearm mortality in parts of the South and Mountain West, overdose mortality once concentrated in Appalachia and now diffused, motor-vehicle death rates that track rurality and road type—and they are the geography most analyses rank on. Crucially, the demographic and geographic strata interact, and the most important findings often live in the interaction: not “the suicide rate” but the suicide rate for a specific age-sex group in a specific kind of place. The dataset's grain—one row per mechanism by intent by demographic by geography by year—is built precisely to support those interaction queries.

Analytical uses

A near-complete, ICD-10-coded, age-adjusted, stratified record of injury death supports a distinctive range of analysis that no survey or partial dataset can.

Trending a mechanism over time is the most immediate use. Because each row carries a year and an age-adjusted rate, an analyst can chart the national overdose, firearm, suicide, or motor-vehicle rate across the ICD-10 era, decompose it by intent to separate suicides from homicides or unintentional from intentional poisonings, and identify the inflection points—the onset of the fentanyl wave, the pandemic-era rise in homicide and motor-vehicle deaths—that define the public-health timeline. The age adjustment is what makes the trend trustworthy across an aging population.

Ranking and mapping states exploits the geographic axis. Ranking states by a chosen mechanism's age-adjusted rate (firearm, overdose, motor vehicle) surfaces the regional structure of injury risk and supports the kind of state-comparison analysis that informs policy debate, provided the comparison uses age-adjusted rather than crude rates and respects the suppression of small-count states.

Disparity analysis brings the demographic strata to bear, quantifying how injury mortality differs by race and ethnicity, sex, and age, and how those gaps have widened or narrowed. This is the empirical backbone of health-equity work.

Joining to other federal data extends the reach: linking overdose mortality to SAMHSA treatment-capacity data to ask whether deaths fall where treatment is available, or pairing motor-vehicle mortality with NHTSA crash and vehicle-safety data to connect outcomes to vehicles and roads. The injury-mortality file is the outcome variable, the deaths, that those upstream datasets help explain.
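For the disparity analysis mentioned above, one common summary is the rate ratio between two groups, tracked year by year. The sketch below assumes two hypothetical series of age-adjusted rates keyed by year, with None standing in for a suppressed cell; the function name and all numbers are invented for illustration.

```python
def ratio_by_year(series_a, series_b):
    """Rate ratio a/b for each year where both rates are published."""
    ratios = {}
    for year in sorted(set(series_a) & set(series_b)):
        a, b = series_a[year], series_b[year]
        # Skip suppressed cells (None) and zero denominators rather than
        # treating a missing rate as zero.
        if a is not None and b:
            ratios[year] = a / b
    return ratios


# Hypothetical age-adjusted rates per 100,000 for two groups.
group_a = {2010: 20.0, 2020: 24.0, 2021: 25.0}
group_b = {2010: 5.0, 2020: 8.0, 2021: None}

ratios = ratio_by_year(group_a, group_b)
for year, value in ratios.items():
    print(year, f"{value:.1f}x")
```

In this invented example both groups' rates rise while the ratio narrows from 4.0 to 3.0, a reminder that a shrinking relative gap does not imply falling mortality in either group.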

Python workflow: injury mortality from CDC data services

The script below pulls injury-mortality rows from the NCHS datasets on data.cdc.gov, served through the Socrata API with tidy, pre-computed crude and age-adjusted rates, and computes the three core analyses: trending a national age-adjusted rate over time (here the drug-overdose rate), ranking states by a chosen mechanism's age-adjusted rate (here firearms), and confirming from the data that suicides make up the majority of firearm deaths by tabulating the firearm mechanism by intent. The same underlying file is also queryable through CDC WONDER for custom cross-tabulations of the underlying-cause-of-death data. No API key is required for modest volumes. NCHS Socrata dataset identifiers and field names vary between releases, and the national injury-mortality resource carries no state breakdown, so the ranking needs a state-resolved release. For both reasons the script isolates the dataset id in one place and resolves the mechanism, intent, geography, and rate column names defensively rather than hard-coding them; any production use should be validated against the current data.cdc.gov catalog and should honor the suppression flags discussed below.

import requests
import pandas as pd

# CDC injury mortality is published two ways, both used here:
#   1. data.cdc.gov (Socrata) -- tidy NCHS injury-mortality datasets,
#      one row per mechanism x intent x demographic x geography x year,
#      with death counts and crude / age-adjusted rates per 100,000.
#   2. CDC WONDER -- the query system behind the underlying-cause-of-
#      death files, queried by an XML request that returns the same
#      ICD-10 external-cause deaths for custom cross-tabulations.
# This script works from the Socrata API: it needs no key for modest
# volumes, returns JSON, and ships the pre-computed age-adjusted rates.
SODA = "https://data.cdc.gov/resource"

# Dataset identifiers (the 4x4 Socrata IDs) change across NCHS releases;
# isolate them here and confirm against the current data.cdc.gov catalog.
# nt65-c7a7 is the NCHS "Injury Mortality: United States" resource.
INJURY_DATASET = "nt65-c7a7"   # NCHS injury-mortality resource id


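def fetch(dataset, where=None, select=None, limit=50000):
    # Socrata accepts SoQL query parameters ($where, $select, $limit,
    # $offset, $order). One call returns at most `limit` rows, so page with
    # $offset until a short page arrives; otherwise a table larger than
    # `limit` is silently truncated.
    url = f"{SODA}/{dataset}.json"
    rows, offset = [], 0
    while True:
        params = {"$limit": limit, "$offset": offset}
        if where:
            params["$where"] = where
        if select:
            params["$select"] = select
        else:
            params["$order"] = ":id"   # stable row order across pages
        r = requests.get(url, params=params, timeout=120)
        r.raise_for_status()
        page = r.json()
        rows.extend(page)
        if len(page) < limit:
            return pd.DataFrame(rows)
        offset += limit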


def _num(df, *names):
    # Return the first column present, coerced to float. NCHS field
    # names vary by release (e.g. age_adjusted_rate vs aa_rate).
    for n in names:
        if n in df.columns:
            return pd.to_numeric(df[n], errors="coerce")
    return pd.Series(dtype=float)


def _col(df, *names):
    # Resolve the first matching column name actually present.
    for n in names:
        if n in df.columns:
            return n
    return None


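# --- Shared filters ---------------------------------------------------
# Every mechanism/year appears once per sex, age, race, and intent
# stratum, so averaging or summing over all rows mixes overlapping
# populations. Pin each stratum to its total first. The "All ..." and
# "Both sexes" labels are assumptions; check them against the release.
def _pin_totals(df, keep_intents=False):
    groups = [("sex",), ("age_group", "age_group_years"),
              ("race_ethnicity", "race")]
    if not keep_intents:
        groups.append(("intent", "injury_intent"))
    for names in groups:
        c = _col(df, *names)
        if c:
            label = df[c].astype(str).str.strip().str.lower()
            df = df[label.str.match(r"(?:all|both)\b")]
    return df.copy()


def _by_mechanism(df, mechanism):
    # Literal, case-insensitive match on whichever mechanism column exists.
    mech_col = _col(df, "mechanism", "injury_mechanism")
    if mech_col:
        df = df[df[mech_col].astype(str).str.contains(
            mechanism, case=False, na=False, regex=False)]
    return df


def _national_only(df):
    # In a state-resolved release keep the national row; the national
    # resource has no geography column and passes through unchanged.
    geo_col = _col(df, "geography", "state", "state_name", "jurisdiction")
    if geo_col:
        df = df[df[geo_col].astype(str).str.lower() == "united states"]
    return df


# --- 1. Trend the national age-adjusted rate for one mechanism --------
# Pull the all-intent, all-demographics rows for drug overdose and order
# by year. Field names resolve defensively because they differ across
# NCHS releases (mechanism vs injury_mechanism, etc.).
def national_trend(mechanism="Drug poisoning"):
    df = fetch(INJURY_DATASET)
    if df.empty:
        print("No rows returned.")
        return df
    df = _pin_totals(_national_only(_by_mechanism(df, mechanism)))
    df["year"] = pd.to_numeric(df.get("year"), errors="coerce")
    df["aa_rate"] = _num(df, "age_adjusted_rate", "aa_rate", "rate")
    # One row per year should remain; mean() only guards against duplicates.
    trend = (df.dropna(subset=["year", "aa_rate"])
               .groupby("year")["aa_rate"].mean()
               .sort_index())
    if trend.empty:
        print(f"No rows match {mechanism!r}; check this release's labels.")
        return trend
    print(f"National age-adjusted rate per 100,000 -- {mechanism}:")
    for yr, rate in trend.items():
        print(f"  {int(yr)}  {rate:6.1f}")
    return trend


# --- 2. Rank states by a chosen mechanism's age-adjusted rate ---------
# Requires a state-resolved release that carries a geography/state
# column; the national resource above has none, so point this at a
# state-level NCHS injury-mortality dataset id when one is available.
def rank_states(mechanism="Firearm", year=None):
    df = fetch(INJURY_DATASET)
    if df.empty:
        print("No rows returned.")
        return df
    geo_col = _col(df, "geography", "state", "state_name", "jurisdiction")
    if not geo_col:
        print("No geography column in this release; use a state-level id.")
        return df
    df = _pin_totals(_by_mechanism(df, mechanism))
    if year:
        df = df[pd.to_numeric(df.get("year"), errors="coerce") == int(year)]
    # Keep state-level rows only; drop the national aggregate.
    df = df[df[geo_col].astype(str).str.lower() != "united states"].copy()
    df["aa_rate"] = _num(df, "age_adjusted_rate", "aa_rate", "rate")
    # With year=None this averages each state's rate over the years pulled.
    ranked = (df.dropna(subset=["aa_rate"])
                .groupby(geo_col)["aa_rate"].mean()
                .sort_values(ascending=False))
    print(f"\nStates ranked by {mechanism} age-adjusted rate (top 15):")
    for state, rate in ranked.head(15).items():
        print(f"  {str(state)[:24]:<24} {rate:6.1f}")
    return ranked


# --- 3. Suicide share of firearm deaths -------------------------------
# A majority of US firearm deaths are suicides; confirm from the data
# by comparing intent-specific counts within the firearm mechanism.
def firearm_intent_mix():
    df = fetch(INJURY_DATASET)
    if df.empty:
        return
    mech_col = _col(df, "mechanism", "injury_mechanism")
    intent_col = _col(df, "intent", "injury_intent")
    if not (mech_col and intent_col):
        print("Mechanism/intent columns not found in this release.")
        return
    # All demographics combined, national level, one row per intent.
    df = _pin_totals(_national_only(_by_mechanism(df, "Firearm")),
                     keep_intents=True)
    # Drop the all-intent total so it is not counted as its own category.
    is_total = df[intent_col].astype(str).str.strip().str.lower()
    df = df[~is_total.str.match(r"all\b")].copy()
    df["deaths"] = _num(df, "deaths", "death_count", "estimate")
    mix = df.groupby(intent_col)["deaths"].sum().sort_values(ascending=False)
    total = mix.sum()
    print("\nFirearm deaths by intent (national, all years pulled):")
    for intent, n in mix.items():
        share = n / total if total else 0
        print(f"  {str(intent)[:18]:<18} {int(n):>9,}  ({share:.1%})")


if __name__ == "__main__":
    national_trend("Drug poisoning")
    rank_states("Firearm")
    firearm_intent_mix()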

Two practical notes apply. First, the script reads the pre-computed age-adjusted rates that NCHS publishes rather than re-deriving them, which is the right default—reproducing age adjustment correctly requires the age-specific death counts, the age-specific population denominators, and the exact standard-population weights, and the published rate already encodes all three. If an analysis needs a non-standard adjustment (a different age grouping, or a custom standard), it must pull the age-specific deaths and populations and re-weight them itself. Second, for serious work the script must respect suppression: cells with very small death counts are withheld for confidentiality and rates based on small numbers are flagged as statistically unreliable, so any aggregation, ranking, or trend must drop or specially handle the suppressed and flagged cells rather than treating a blank as a zero. The rate_flag column is there for exactly this, and ignoring it silently biases every downstream number.

Limitations and analytical caveats

The injury-mortality data is the most authoritative record of fatal injury in the United States, but it carries structural limitations that an analyst must internalize before drawing conclusions from it.

Small counts are suppressed, and small-count rates are unreliable. To protect confidentiality, NCHS suppresses death counts below a threshold, so cells for rare mechanism-and-intent combinations in small demographic or geographic strata simply do not appear. And even where a count is published, a rate computed from a small number of deaths is statistically unstable, so NCHS flags rates based on fewer than a threshold number of deaths as unreliable. In CDC WONDER, for example, counts of fewer than ten deaths are suppressed and rates based on fewer than 20 deaths are marked unreliable. The practical consequences are twofold: a blank cell is not a zero (the deaths happened; they are merely withheld), and a volatile-looking rate in a small state or a narrow stratum may be noise rather than signal. Treating suppressed cells as zeros, or ranking on unstable small-count rates, produces confidently wrong conclusions precisely where the data is thinnest.
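The rule that a blank is not a zero can be sketched as a filter applied before any ranking. The rows, the state names, and the rate_flag values ("suppressed", "unreliable", empty string) below are hypothetical; the actual encoding of the flag depends on the release and should be checked before reusing this logic.

```python
# Hypothetical rows; None marks a withheld count or rate.
rows = [
    {"state": "A", "deaths": 250, "aa_rate": 12.4, "rate_flag": ""},
    {"state": "B", "deaths": None, "aa_rate": None, "rate_flag": "suppressed"},
    {"state": "C", "deaths": 14, "aa_rate": 21.0, "rate_flag": "unreliable"},
    {"state": "D", "deaths": 90, "aa_rate": 15.1, "rate_flag": ""},
]

EXCLUDE = {"suppressed", "unreliable"}  # assumed flag vocabulary


def rankable(rows):
    """Keep rows whose rate is published and not flagged as unstable."""
    return [
        r for r in rows
        if r["rate_flag"] not in EXCLUDE and r["aa_rate"] is not None
    ]


def rank(rows):
    """Rank the usable rows by age-adjusted rate, highest first."""
    return sorted(rankable(rows), key=lambda r: r["aa_rate"], reverse=True)


ranked = [r["state"] for r in rank(rows)]
excluded = [r["state"] for r in rows if r not in rankable(rows)]
print("ranked:  ", ranked)
print("excluded:", excluded)
```

Listing the excluded states alongside the ranking is a deliberate choice in this sketch: it keeps the withheld cells visible instead of letting them vanish or be read as zeros. Note that the unreliable state would have topped a naive ranking on the strength of 14 deaths.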

Recent years are provisional. Final mortality data is released after a substantial lag—deaths must be registered, certified causes resolved (which for injury deaths often awaits the completion of a medical-examiner investigation, toxicology, or an inquest), coded, and reconciled—so the most recent periods are published only as provisional estimates that are revised upward as late-arriving certificates are processed. Provisional counts for the latest months systematically undercount, especially for drug-overdose and other deaths that depend on toxicology and investigation. Any analysis of the leading edge of a trend must use provisional data with explicit caution and expect upward revision; the dataset is authoritative for established years and multi-year trends, not for pinning down last quarter's number.

Intent can be misclassified, and the undetermined category is real. Deciding whether an injury death was unintentional, a suicide, or a homicide is a judgment made by a certifier—a medical examiner or coroner—under varying resources, standards, and local practice, and that judgment is not always certain. Some deaths are coded undetermined because the evidence could not establish intent; others may be misclassified, and there is well-documented concern that some suicides, particularly some poisoning and overdose deaths, are recorded as unintentional or undetermined rather than as suicide. Because intent classification varies across jurisdictions and over time, intent-specific comparisons across states—and especially the boundary between unintentional, suicide, and undetermined—must be made with the understanding that some of the apparent difference reflects certification practice rather than reality. The undetermined category should be examined alongside the intents it sits between, not discarded.

The death certificate is the unit, with its strengths and its blind spots. NVSS is a near-complete census, which is its great virtue—but it records only deaths, not the far larger universe of nonfatal injuries, and it captures only what the certificate captures. Race and ethnicity on death certificates can be misreported relative to self-identification, biasing rates for some groups—particularly American Indian and Alaska Native populations, whose injury and overdose mortality is understood to be undercounted by certificate misclassification. Geographic detail finer than the state thins quickly into suppression. And the single underlying-cause framing compresses a death into one mechanism and one intent, even when the multiple-cause fields—essential for naming the specific drugs in an overdose—tell a richer story. Held with these caveats in mind, the cdc_injury_mortality table remains the definitive federal record of how Americans die from injury: a near-complete, ICD-10-coded, age-adjusted, stratified ledger of firearms, overdoses, crashes, falls, and the rest of the external causes—the outcome the country's entire injury-prevention enterprise exists to bend downward.

Related writing

CDC Nutrition, Physical Activity, and Obesity: The Federal Surveillance Record of American Health Behavior — The companion CDC surveillance system on the chronic-disease side: where injury mortality records how Americans die from external causes, the NPAO behavioral data records the diet and activity patterns that drive the leading natural causes of death, both reported as stratified rates by state and demographic.

SAMHSA Treatment Data: The Federal Database Behind Substance Abuse and Mental Health Program Statistics — The upstream counterpart to overdose mortality: SAMHSA's treatment-capacity and admissions data describes the system meant to prevent the deaths the injury-mortality file counts, and joining the two asks whether overdose deaths fall where treatment is available.

NHTSA Vehicle Safety Complaints: The Federal Database Behind Auto Defect Investigations and Recalls — The vehicle-side complement to motor-vehicle mortality: NHTSA's defect, complaint, and recall data describes the vehicles and safety failures behind crashes, while the injury-mortality file records the fatal outcomes those crashes produce on the road.