NHTSA Defect Investigations: The Federal Record of What Leads to a Recall

Before there is a recall, there is an investigation. A pattern of complaints starts to cluster around the same component, the same model years, the same failure—a steering shaft that loses assist, an airbag that ruptures, a tire that sheds its tread—and a small federal office in Washington opens a file. NHTSA's Office of Defects Investigation works that space between a signal and a remedy: roughly 5,300 defect investigations, each keyed to an action number, each a record of how a potential safety defect was found, escalated, and either forced into a recall or closed without one. It is the upstream half of the recall record—the account of how the government decides, case by case, that something on the road is dangerous enough to call back.

This article covers what the defect-investigations dataset is and how the Office of Defects Investigation fits inside NHTSA; the statutory frame—the National Traffic and Motor Vehicle Safety Act and the TREAD Act—that gives the agency its defect authority; the escalation ladder by which an inquiry moves from a Preliminary Evaluation to an Engineering Analysis and on to a recall, a closure, or a formal recall-request letter; the landmark cases—Firestone and Ford, Toyota unintended acceleration, the GM ignition switch, and the Takata airbag inflator, the largest auto recall in history—that the investigation record traces from first signal to mass remedy; how the investigation table joins by action number and recall campaign number to the complaint and recall datasets that sit on either side of it; the analyses the data supports, from manufacturer and component patterns to investigation duration and recall-conversion rates; a Python workflow that pulls the ODI flat files, tallies investigations by type and component, and links them to recalls; and the caveats—a discretionary, signal-driven, summary-coded record—that every analyst must keep in mind.

What the dataset is

The Office of Defects Investigation (ODI) is the unit inside the National Highway Traffic Safety Administration that investigates potential safety defects in motor vehicles and motor-vehicle equipment. When ODI opens an inquiry, it assigns an action number, a short code such as PE24-001 or EA22-004 whose prefix records the type of investigation and whose suffix records the year and sequence, and it tracks that inquiry through to a resolution. The defect-investigations dataset is the record of those inquiries: roughly 5,300 investigations, one row per action number, capturing what was investigated, which manufacturer built it, when the file opened and closed, and, crucially, whether it ended in a recall.

In our database this record is stored as the table nhtsa_investigations, with the grain of one row per investigation. It is the connective tissue of the NHTSA safety data: it sits directly downstream of the complaint record—the consumer-reported failures that often supply the signal that opens an investigation—and directly upstream of the recall record, because a confirmed defect investigation is one of the main ways a recall comes to be. The columns identify the subject vehicle or equipment, the manufacturer, the component at issue, the dates, and the recall campaign number assigned if the investigation produced a recall:

nhtsa_action_number  -- the investigation ID; prefix encodes the type
                        (PE = Preliminary Evaluation, EA = Engineering
                         Analysis, RQ = Recall Query, DP = Defect Petition)
make                 -- vehicle / equipment make under investigation
model                -- model under investigation
year                 -- model year(s) covered
component            -- the system or part being investigated
mfr_name             -- manufacturer named in the investigation
open_date            -- date ODI opened the investigation
close_date           -- date ODI closed the investigation
campaign_number      -- recall campaign number, if the inquiry led to a
                        recall (the join key to the recall dataset)
subject              -- short subject line of the investigation
summary              -- narrative summary of the alleged defect and findings

Two columns do the heaviest analytical work. The nhtsa_action_number is the primary key, and its two-letter prefix is itself a data point: it tells you the investigation type and therefore where on the escalation ladder the inquiry sits. The campaign_number is the bridge to the rest of the safety record: a non-empty recall campaign number means the investigation ended in a recall and can be joined straight to the recall dataset, where the affected-population count, the remedy, and the completion rate live. Around those two keys, open_date and close_date measure how long the agency took, the component field names the system at fault, and the summary carries the narrative the coded fields compress. Together they turn the table from a list of file numbers into a traceable account of how each potential defect was worked.
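As a minimal sketch of reading the action number as data, the function below splits a code such as PE24-001 into its prefix, two-digit year, and sequence. The regular expression encodes an assumed shape (two letters, two digits, an optional hyphen, then digits) inferred from the examples in this article; real files may contain variants it rejects, and the names ActionNumber and parse_action_number are illustrative only.

```python
import re
from typing import NamedTuple, Optional

# Prefix meanings as listed in the column description above.
TYPE_NAMES = {
    "PE": "Preliminary Evaluation",
    "EA": "Engineering Analysis",
    "RQ": "Recall Query",
    "DP": "Defect Petition",
}

# Assumed shape: two letters, two-digit year, optional hyphen, sequence.
ACTION_RE = re.compile(r"^([A-Z]{2})(\d{2})-?(\d+)$")


class ActionNumber(NamedTuple):
    prefix: str
    type_name: str
    year2: str      # two-digit year as written; the century is not inferred
    sequence: int


def parse_action_number(raw: Optional[str]) -> Optional[ActionNumber]:
    """Split an action number into its parts, or return None if it does not fit."""
    m = ACTION_RE.match((raw or "").strip().upper())
    if m is None:
        return None
    prefix, year2, seq = m.groups()
    return ActionNumber(prefix, TYPE_NAMES.get(prefix, "Other"), year2, int(seq))


print(parse_action_number("PE24-001"))
print(parse_action_number("ea22-004"))
print(parse_action_number("not-an-id"))
```

Returning None for anything that does not match keeps malformed identifiers visible to the caller instead of silently bucketing them under a guessed type.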

What it is and the statutory frame

NHTSA's defect authority rests on the National Traffic and Motor Vehicle Safety Act of 1966, the law that created the federal motor-vehicle safety program after a decade of rising road deaths and the public reckoning over automotive safety that surrounded it. The Safety Act did two things that frame this dataset. It authorized the agency to issue the Federal Motor Vehicle Safety Standards—the binding rules that govern how vehicles must be built—and, separately, it gave the agency the power to identify safety-related defects in vehicles and equipment already on the road and to compel the responsible manufacturer to remedy them. A safety defect, in the statute's terms, is a problem that exists in a group of vehicles or items of equipment and that poses an unreasonable risk to safety. The defect-investigation process is how the agency exercises that second power: it is the machinery for finding the defects the standards did not prevent.

The other foundational statute is the Transportation Recall Enhancement, Accountability, and Documentation (TREAD) Act of 2000, passed in the direct aftermath of the Firestone tire failures on Ford Explorers. The TREAD Act's central innovation was Early Warning Reporting: it required manufacturers to submit to NHTSA, on a regular schedule, data on death and injury claims, warranty claims, field reports, and consumer complaints—the kind of internal information that, in the Firestone episode, had revealed the problem to the manufacturer long before the public or the agency knew of it. Early Warning Reporting gave ODI a second, structured stream of defect signals to set alongside consumer complaints, and it raised the penalties for manufacturers that concealed known defects. The investigations dataset is, in a real sense, the visible output of that signal-detection apparatus: it records the cases the agency chose to pursue out of the complaint and early-warning streams flowing into it.

The defect process is also distinct from the standards-compliance process, and the distinction matters for reading the data. A vehicle can fully comply with every Federal Motor Vehicle Safety Standard and still contain a safety-related defect—a flaw that no standard happens to address. ODI's defect investigations reach exactly those flaws: the unanticipated failure modes, the design or manufacturing problems that emerge only in the field. This is why the investigation record is the leading edge of the safety system. The standards set the floor for how vehicles are built; the recalls record the remedies ordered; but it is the investigations that occupy the deciding middle, where the agency works out whether a pattern of failures amounts to a defect at all.

The escalation ladder: PE to EA to recall

The single most important thing to understand about this dataset is that an ODI investigation is not a one-step event but a process that escalates through defined stages, and the action number prefix tells you which stage a record represents. Reading those stages in order is how you reconstruct the path from a first signal to a recall.

An inquiry typically opens as a Preliminary Evaluation (PE). When complaint patterns, early-warning data, a petition, or another signal suggest a possible defect, ODI opens a PE to take a first, screening-level look: it gathers the relevant complaints and field reports, sends the manufacturer an information request asking what it knows about the alleged failure, and assesses whether the problem is real, safety-related, and widespread enough to warrant deeper work. A Preliminary Evaluation is meant to be relatively quick—a triage stage—and it ends one of three ways: the manufacturer announces a recall, removing the need for further agency action; ODI closes the inquiry because the evidence does not support a defect finding; or ODI upgrades it to the next stage.

That next stage is an Engineering Analysis (EA). An EA is the deeper, more resource-intensive investigation: ODI conducts engineering evaluations, may inspect or test vehicles, examines the failure mechanism in detail, analyzes the full body of complaint and warranty data, and presses the manufacturer for more comprehensive information. The Engineering Analysis is where the agency builds the technical case for whether a safety-related defect exists. Like the PE, it can end in a manufacturer recall or a closure—but it can also end in the agency's most assertive non-mandatory step. Alongside these two principal stages, the dataset also carries other action types: a Recall Query (RQ), which examines the scope or adequacy of a recall the manufacturer has already conducted; and a Defect Petition (DP), opened when an outside party formally petitions the agency to investigate. The prefix on the action number is what lets an analyst sort the record by exactly these categories.

The terminal step, when the agency concludes a defect exists but the manufacturer will not act voluntarily, is a recall-request letter—and, if that does not produce a recall, an initial decision that a defect exists followed by a public meeting and, ultimately, a formal order. In practice the overwhelming majority of defects are recalled voluntarily once an investigation makes the case; the manufacturer reads the same data the agency does and chooses to recall rather than litigate a federal defect finding. The recall-request letter is the rarely-pulled lever whose existence does most of the work, because its credible threat is what converts an investigation into a voluntary recall. For the dataset, the consequence is that most investigations resolve as either a recall or a closure, with the campaign-number field recording which inquiries crossed the line into a remedy and which did not.

Landmark cases the record traces

ODI's investigations have driven some of the largest and most consequential safety actions in United States history, and the investigation record is where each of them can be followed from an early signal to a recall. Four cases define the office's history.

Firestone tires on Ford Explorers. Around the turn of the millennium, certain Firestone tires—many fitted as original equipment on the Ford Explorer—were failing by shedding their tread, and the resulting blowouts contributed to rollover crashes and a large number of deaths and injuries. The episode exposed how slowly the existing signals had surfaced the problem and how much the manufacturers had known internally before the public did. It produced massive tire and vehicle recalls and, directly, the TREAD Act and its Early Warning Reporting regime—making Firestone not just a landmark investigation but the case that reshaped the entire defect-detection system this dataset documents.

Toyota unintended acceleration. A cluster of reports of vehicles accelerating without driver input led to high-profile ODI investigations and very large recalls addressing floor-mat entrapment of the accelerator pedal and sticking pedal mechanisms. The episode became a defining test of how the agency investigates a complex, multi-cause failure mode in which mechanical, electronic, and human factors are entangled, and it ended in some of the most significant enforcement penalties of the era for the way the defects had been handled.

The GM ignition switch. A defective ignition switch that could slip out of the run position—cutting power to the engine and, critically, disabling the airbags—was tied to numerous deaths and injuries and to a recall of millions of vehicles. The case is studied because of the long gap between the early indications of the problem and the eventual recall; it prompted intense scrutiny of how warning signals are triaged, how internal manufacturer knowledge is surfaced, and how an investigation escalates, and it remains a reference point for the failure modes the investigation process is meant to catch sooner.

Takata airbag inflators. The defect with the widest reach of all: airbag inflators that, after prolonged exposure to heat and humidity, could rupture when deployed, turning a safety device into a source of shrapnel inside the cabin. The Takata defect produced the largest automotive recall in United States history, spanning tens of millions of vehicles across many manufacturers, coordinated by NHTSA over years. The investigation and coordinated-remedy program around Takata show the dataset at its most ambitious scale: a single defect in a shared component, drawing investigations and recalls across nearly the entire industry, tracked through the same action-number and campaign-number machinery as the smallest single-model inquiry.

Joining to the complaint and recall datasets

The defect-investigations table is most valuable as the hinge between the two datasets it sits between, and the join keys make that position explicit. The investigation record points backward to the complaints that often triggered it and forward to the recalls it produced.

Looking backward to complaints, the connection is by subject—the same make, model, model years, and component. Consumer complaints, filed with NHTSA as Vehicle Owner Questionnaires, are one of the principal raw signals from which ODI decides to open an investigation: a rising count of complaints describing the same failure in the same vehicles is exactly the pattern a Preliminary Evaluation is opened to examine. By aligning an investigation's subject vehicle and component with the complaint dataset, an analyst can reconstruct the signal that preceded the file—how many complaints had accumulated, over what period, describing what failure—and can study the lead time between the complaint pattern emerging and ODI acting on it. That lead time is one of the most policy-relevant quantities in the whole safety system, because it measures how quickly the agency turns a public signal into an investigation.
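The subject-based alignment can be sketched in plain Python on made-up rows, so it runs without either download. The field names (received, years, open_date), the ACME vehicle, and the action number PE21-999 are all hypothetical; a real analysis would also need to reconcile how the two files spell makes, models, and components, which this sketch assumes already match exactly.

```python
from datetime import date

# Made-up complaint rows; field names are assumptions, not the published layout.
complaints = [
    {"make": "ACME", "model": "ROADSTER", "year": 2019, "component": "STEERING", "received": date(2021, 3, 1)},
    {"make": "ACME", "model": "ROADSTER", "year": 2019, "component": "STEERING", "received": date(2021, 6, 15)},
    {"make": "ACME", "model": "ROADSTER", "year": 2020, "component": "STEERING", "received": date(2021, 9, 30)},
    {"make": "ACME", "model": "ROADSTER", "year": 2019, "component": "AIR BAGS", "received": date(2021, 4, 2)},
    {"make": "ACME", "model": "ROADSTER", "year": 2019, "component": "STEERING", "received": date(2022, 2, 1)},
]

# Hypothetical investigation covering two model years.
investigation = {
    "action": "PE21-999",
    "make": "ACME", "model": "ROADSTER", "years": {2019, 2020},
    "component": "STEERING", "open_date": date(2021, 12, 1),
}


def preceding_signal(inv, complaints):
    """Summarize same-subject complaints received before the file opened."""
    matched = sorted(
        c["received"] for c in complaints
        if c["make"] == inv["make"]
        and c["model"] == inv["model"]
        and c["year"] in inv["years"]
        and c["component"] == inv["component"]
        and c["received"] < inv["open_date"]
    )
    if not matched:
        return {"count": 0, "first": None, "lead_days": None}
    return {
        "count": len(matched),
        "first": matched[0],
        # Days from the first matching complaint to the investigation opening.
        "lead_days": (inv["open_date"] - matched[0]).days,
    }


signal = preceding_signal(investigation, complaints)
print(signal)
```

Measuring from the first matching complaint is only one possible definition of lead time; measuring from the date the complaint count crossed some threshold would be an equally defensible choice.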

Looking forward to recalls, the join is direct and clean: the recall campaign number recorded on an investigation is the same identifier that keys the recall dataset. Where an investigation produced a recall, the campaign number links it to the recall record's affected-population count, the defect description, the remedy the manufacturer offered, and the completion rate, meaning the share of affected vehicles actually fixed. This is what closes the loop: it lets an analyst follow a single defect from the complaint pattern that raised it, through the Preliminary Evaluation and Engineering Analysis that worked it, to the recall that remedied it and the completion rate that measured whether the remedy reached the road. Treated together, the three datasets (complaints, investigations, recalls) form a single pipeline keyed on vehicle subject and campaign number, and the investigation table is the stage in the middle that the other two cannot supply on their own.
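A minimal sketch of the forward join, again on made-up rows: the campaign numbers are placeholders, and the affected and remedied fields are assumed names for the recall record's population and fix counts. The point it illustrates is that a lookup on the campaign number yields three distinct states, not two.

```python
# Made-up rows; campaign numbers and field names are placeholders.
investigations = [
    {"action": "PE20-901", "campno": "20V900000"},
    {"action": "EA21-902", "campno": " 21v900001 "},   # stray spaces and case
    {"action": "PE22-903", "campno": ""},              # no recall recorded
    {"action": "PE22-904", "campno": "22V900002"},     # recall absent from extract
]
recalls = [
    {"campno": "20V900000", "affected": 120_000, "remedied": 90_000},
    {"campno": "21V900001", "affected": 45_000, "remedied": 18_000},
]


def norm(campno):
    """Normalize a campaign number so whitespace and case do not break the join."""
    return (campno or "").strip().upper()


recall_by_campaign = {norm(r["campno"]): r for r in recalls}


def link(inv):
    """Attach recall figures to one investigation, labelling the join outcome."""
    key = norm(inv["campno"])
    if not key:
        return {**inv, "status": "no_recall"}
    rec = recall_by_campaign.get(key)
    if rec is None:
        return {**inv, "status": "unmatched"}
    rate = rec["remedied"] / rec["affected"] if rec["affected"] else None
    return {**inv, "status": "matched",
            "affected": rec["affected"], "completion_rate": rate}


linked = [link(i) for i in investigations]
for row in linked:
    print(row["action"], row["status"], row.get("completion_rate"))
```

Keeping "unmatched" separate from "no_recall" matters in practice: a campaign number that finds no recall row points to a stale or incomplete extract, not to an investigation that closed without a remedy.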

Analytical uses

A complete record of the inquiries the agency chose to pursue—dated, typed, and tied to their outcomes—supports a distinctive set of analyses about how the federal government polices vehicle safety in the space between recalls.

Investigations by manufacturer and component. The most immediate use is to count investigations by manufacturer and by the component at issue, which reveals where the agency's attention concentrates—which manufacturers draw the most inquiries, and which systems (airbags and restraints, steering, fuel and propulsion, electrical, brakes, tires) recur as the subjects of investigation. Read against the size of each manufacturer's fleet, these counts begin to distinguish problems that scale with production volume from genuine reliability concentrations. Component patterns over time also track the shifting frontier of vehicle risk, as electronic and software-driven failure modes take a larger share alongside the mechanical defects that dominated earlier eras.
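To illustrate why raw counts need a denominator, the sketch below tallies made-up investigations by manufacturer and component, then divides by a hypothetical fleet size. The fleet figures are invented and are not part of the ODI data; in a real analysis they would have to come from a separate registration or production source.

```python
from collections import Counter

# Made-up (manufacturer, component) pairs standing in for MFR_NAME / COMPONENT.
investigations = [
    ("ALPHA MOTORS", "AIR BAGS"),
    ("ALPHA MOTORS", "STEERING"),
    ("ALPHA MOTORS", "AIR BAGS"),
    ("ALPHA MOTORS", "SERVICE BRAKES"),
    ("BETA AUTO", "AIR BAGS"),
    ("BETA AUTO", "FUEL SYSTEM"),
]
# Hypothetical vehicles in operation, in millions; not part of the ODI data.
fleet_millions = {"ALPHA MOTORS": 10.0, "BETA AUTO": 2.0}

by_mfr = Counter(m for m, _ in investigations)
by_component = Counter(c for _, c in investigations)

# Investigations per million vehicles on the road.
per_million = {m: n / fleet_millions[m] for m, n in by_mfr.items()}

for mfr, n in by_mfr.most_common():
    print(f"{mfr:<14} {n} investigations, {per_million[mfr]:.1f} per million vehicles")
print("Most common component:", by_component.most_common(1)[0])
```

In this toy data the larger manufacturer draws twice as many investigations but the smaller one has the higher rate per vehicle, which is the distinction between volume effects and genuine concentrations described above.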

Investigation duration and escalation. Because each record carries an open date and a close date and a type prefix, an analyst can measure how long investigations take and how often they escalate—the share of Preliminary Evaluations that upgrade to an Engineering Analysis, and the time each stage consumes. These durations are a measure of agency throughput and a lens on the cases that move slowly, which is exactly where the most-scrutinized failures—the long gaps between early signal and recall—tend to live. The escalation rate, in particular, says something about how selectively the office triages: a PE is a screen, and the fraction that survives to an EA reflects how much of the incoming signal turns out to merit deep investigation.
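A duration calculation might look like the following sketch. It assumes the open and close dates arrive as YYYYMMDD strings, which should be confirmed against the current file-format document, and it treats a blank or unparseable close date as a still-open investigation that is excluded from the medians. The rows are made up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Made-up rows: (action number, open date, close date) as YYYYMMDD strings.
rows = [
    ("PE20-901", "20200110", "20200509"),
    ("PE21-902", "20210301", "20210331"),
    ("EA20-903", "20200601", "20211201"),
    ("EA22-904", "20220115", ""),          # still open: no close date yet
]


def parse_date(text):
    """Parse an assumed YYYYMMDD string; return None for blank or bad values."""
    try:
        return datetime.strptime((text or "").strip(), "%Y%m%d").date()
    except ValueError:
        return None


durations = defaultdict(list)   # type prefix -> list of durations in days
still_open = 0
for action, opened, closed in rows:
    o, c = parse_date(opened), parse_date(closed)
    if o is None or c is None:
        still_open += 1
        continue
    durations[action[:2]].append((c - o).days)

medians = {t: median(d) for t, d in durations.items()}
print("Median days by type:", medians)
print("Excluded (open or undated):", still_open)
```

Reporting the excluded count alongside the medians makes it visible how much of the record the duration figure does not cover.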

Recall-conversion rate. The campaign-number field supports the dataset's signature metric: the share of investigations that end in a recall. Computed overall, by manufacturer, by component, and by investigation type, the conversion rate describes how often the agency's scrutiny actually produces a remedy—and how that varies. It is the clearest single summary of what the investigation process accomplishes. Finally, linking the full pipeline—complaints to investigations to recalls—supports the lead-time and effectiveness analyses that no single dataset can produce alone: how fast a complaint pattern becomes an investigation, how fast an investigation becomes a recall, and whether the recall reached the affected vehicles.
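The grouped conversion rate can be sketched with a small helper that takes any grouping function. The rows and campaign numbers are made up, and the rule that a non-blank campaign number means "ended in a recall" is the same first-pass assumption used elsewhere in this article.

```python
from collections import defaultdict

# Made-up rows; campaign numbers are placeholders.
rows = [
    {"action": "PE20-901", "component": "AIR BAGS", "campno": "20V900000"},
    {"action": "PE20-902", "component": "STEERING", "campno": ""},
    {"action": "PE21-903", "component": "AIR BAGS", "campno": None},
    {"action": "EA21-904", "component": "AIR BAGS", "campno": "21V900001"},
    {"action": "EA22-905", "component": "STEERING", "campno": "22V900002"},
    {"action": "RQ22-906", "component": "STEERING", "campno": "  "},
]


def conversion_by(rows, key):
    """Return {group: (recalled, total, rate)} for the grouping function `key`."""
    totals, recalled = defaultdict(int), defaultdict(int)
    for r in rows:
        k = key(r)
        totals[k] += 1
        if (r.get("campno") or "").strip():   # non-blank campaign number
            recalled[k] += 1
    return {k: (recalled[k], totals[k], recalled[k] / totals[k]) for k in totals}


by_type = conversion_by(rows, lambda r: r["action"][:2])
by_component = conversion_by(rows, lambda r: r["component"])

for label, table in (("type", by_type), ("component", by_component)):
    for group, (hit, total, rate) in sorted(table.items()):
        print(f"{label:<9} {group:<10} {hit}/{total} = {rate:.0%}")
```

Returning the numerator and denominator with each rate keeps small groups from being mistaken for strong patterns.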

Python workflow: investigations by type, component, and recall outcome

NHTSA publishes the ODI defect investigations both as documented bulk flat files—a pipe-delimited download of every investigation, keyed by action number—and through the agency's public REST API at api.nhtsa.dot.gov, which is most useful here for confirming the recalls an investigation produced. No API key is required for the public data. The script below downloads the investigation flat file, derives the investigation type from the action-number prefix (Preliminary Evaluation versus Engineering Analysis versus the other types), tallies the most-investigated components, and computes the recall-conversion rate from the campaign-number field. Because the ODI flat-file layout changes between refreshes, any production use should be validated against the current ODI file-format documentation and should treat the column order below as a starting point to confirm, not a fixed contract. Requirements: requests and pandas.

import requests, io, zipfile
import pandas as pd

# NHTSA Office of Defects Investigation (ODI) -- public data, no key.
# Two complementary public sources are used together:
#   1. The ODI investigation flat files (FLAT_INV), a pipe-delimited
#      bulk download of every defect investigation, keyed by the
#      investigation / action number (e.g. PE24-001, EA23-004).
#   2. The documented REST API at api.nhtsa.dot.gov, used here only to
#      confirm the recalls that an investigation produced.
# The flat-file layout and download path can change between ODI refreshes,
# so confirm INV_COLS below against the current file-format document.
FLAT_INV = "https://static.nhtsa.gov/odi/ffdd/inv/FLAT_INV.zip"
RECALL_API = "https://api.nhtsa.dot.gov/recalls/recallsByVehicle"

# Field order published in the ODI investigation file-format document.
INV_COLS = [
    "NHTSA_ACTION_NUMBER", "MAKE", "MODEL", "YEAR", "COMPONENT",
    "MFR_NAME", "ODATE", "CDATE", "CAMPNO", "SUBJECT", "SUMMARY",
]


def load_investigations(url=FLAT_INV):
    r = requests.get(url, timeout=300)
    r.raise_for_status()
    zf = zipfile.ZipFile(io.BytesIO(r.content))
    name = next(n for n in zf.namelist() if n.upper().endswith(".TXT"))
    with zf.open(name) as fh:
        df = pd.read_csv(fh, sep="|", names=INV_COLS, dtype=str,
                         header=None, encoding="latin-1",
                         on_bad_lines="skip")
    return df


df = load_investigations()
print(f"Investigation records loaded: {len(df):,}")

# --- 1. Type mix: Preliminary Evaluation vs Engineering Analysis -------
# The action number prefix encodes the investigation type:
#   PE = Preliminary Evaluation   EA = Engineering Analysis
#   RQ = Recall Query             DP = Defect Petition  (and others)
def inv_type(action):
    a = (action or "").strip().upper()
    return a[:2] if len(a) >= 2 and a[:2].isalpha() else "??"

df["TYPE"] = df["NHTSA_ACTION_NUMBER"].map(inv_type)
print("\nInvestigations by type:")
for t, n in df["TYPE"].value_counts().items():
    print(f"  {t:<4} {n:>6,}")

# --- 2. Most-investigated components -----------------------------------
print("\nTop 12 components under investigation:")
for comp, n in df["COMPONENT"].fillna("(none)").value_counts().head(12).items():
    print(f"  {comp[:42]:<42} {n:>5,}")

# --- 3. Investigations that produced a recall --------------------------
# CAMPNO carries the recall campaign number when an investigation ended
# in a recall. A non-empty CAMPNO is the link to the recall dataset.
has_recall = df["CAMPNO"].fillna("").str.strip().ne("")
rate = has_recall.mean()
print(f"\nInvestigations linked to a recall campaign: "
      f"{int(has_recall.sum()):,} ({rate:.1%} of all investigations)")


def recalls_for_vehicle(make, model, year):
    p = {"make": make, "model": model, "modelYear": year}
    r = requests.get(RECALL_API, params=p, timeout=30)
    r.raise_for_status()
    return r.json().get("results", [])

Two notes on the script. First, the type classification leans entirely on the action-number prefix, which is the genuine convention ODI uses—PE for Preliminary Evaluation, EA for Engineering Analysis, RQ for Recall Query, DP for Defect Petition—so reading the first two letters is a reliable way to bucket the record by stage; but it does not by itself capture an inquiry that began as a PE and was later upgraded to an EA, which appears as two related records. A rigorous escalation analysis must thread those linked records together by subject and timing rather than treating each action number as an isolated event. Second, the recall-conversion calculation here uses the presence of a campaign number as the signal that an investigation ended in a recall; that is the right first pass, but the cleanest conversion analysis joins on the campaign number into the recall dataset itself, which carries the authoritative affected-population counts and completion rates and lets an analyst weight an investigation by how many vehicles its recall actually reached.
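One way to thread a PE to the EA that succeeded it is a heuristic match on subject and timing, sketched below on made-up records. The rule used here (same make, model, and component, with the EA opening within 30 days of the PE closing) is an assumption chosen for illustration, not a documented ODI linkage; the window_days parameter and the function name link_upgrades are hypothetical.

```python
from datetime import date

# Made-up records; closed=None means the investigation is still open.
records = [
    {"action": "PE20-901", "make": "ACME", "model": "ROADSTER", "component": "STEERING",
     "opened": date(2020, 1, 10), "closed": date(2020, 6, 1)},
    {"action": "EA20-902", "make": "ACME", "model": "ROADSTER", "component": "STEERING",
     "opened": date(2020, 6, 1), "closed": date(2021, 3, 1)},
    {"action": "PE20-903", "make": "ACME", "model": "HAULER", "component": "BRAKES",
     "opened": date(2020, 2, 1), "closed": date(2020, 5, 1)},
    {"action": "EA21-904", "make": "ZENITH", "model": "COMET", "component": "AIR BAGS",
     "opened": date(2021, 1, 5), "closed": None},
]


def subject(r):
    return (r["make"], r["model"], r["component"])


def link_upgrades(records, window_days=30):
    """Heuristically map each upgraded PE to its EA: {pe_action: ea_action}."""
    pes = [r for r in records if r["action"].startswith("PE")]
    eas = [r for r in records if r["action"].startswith("EA")]
    links = {}
    for ea in eas:
        candidates = [
            pe for pe in pes
            if subject(pe) == subject(ea)
            and pe["closed"] is not None
            and abs((ea["opened"] - pe["closed"]).days) <= window_days
            and pe["action"] not in links      # each PE links to at most one EA
        ]
        if candidates:
            # Prefer the PE whose close date is nearest the EA's open date.
            best = min(candidates, key=lambda pe: abs((ea["opened"] - pe["closed"]).days))
            links[best["action"]] = ea["action"]
    return links


links = link_upgrades(records)
n_pe = sum(1 for r in records if r["action"].startswith("PE"))
upgrade_share = len(links) / n_pe
print(links, f"upgrade share = {upgrade_share:.0%}")
```

Because the match is heuristic, any links it produces are worth spot-checking against the investigation summaries, which often name the predecessor inquiry explicitly.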

Limitations and analytical caveats

The defect-investigations record is the most complete public account of how the agency works potential safety defects, but it carries structural features an analyst must internalize before drawing conclusions from it.

It is a discretionary record, not a census of defects. The dataset contains the inquiries ODI chose to open, which is not the same as the set of defects that exist. The office has finite resources and triages a far larger stream of complaints and early-warning signals than it can investigate; the decision to open a Preliminary Evaluation is itself an act of judgment shaped by the apparent severity, the volume of the signal, and the office's priorities at the time. An absence of investigations for a given manufacturer or component therefore cannot be read as evidence that no defect existed, only that the agency did not formally pursue one. The data describes the agency's attention as much as it describes the vehicles.

A closure without action is not an exoneration, and a recall is not a defect finding. An investigation that closes without agency action may close because the evidence did not support a defect, because the manufacturer recalled voluntarily while the inquiry was pending, or because the agency redirected its resources; the reason lives in the summary narrative, not in the bare fact of closure, and a voluntary recall of this kind should still appear in the campaign-number field. Conversely, the great majority of recalls tied to investigations are voluntary: the manufacturer chose to recall rather than contest a federal defect finding, which means a recall reflects a negotiated outcome more often than an adjudicated defect. Reading either outcome as a clean legal verdict over-reads what the record states.

The coded fields summarize; the narrative carries the texture. The component field, the dates, and the type prefix compress a complex investigation into a few structured values. What actually went wrong, how serious it was, what the manufacturer's information request revealed, and why the office reached the outcome it did all live in the investigation's summary and supporting documents, not in the coded columns. The component taxonomy is also coarse and has shifted over time, so aggregations by component should be sanity-checked against the underlying subjects rather than trusted as a stable, fine-grained classification. The structured data is excellent for counting and joining; it is a poor substitute for reading the record when the question is qualitative.

There is reporting and resolution lag. An investigation opens, runs for months or years, and closes only when the agency resolves it—so any snapshot of the dataset contains open investigations whose ultimate outcome is not yet recorded, and the recall-conversion rate computed on a current extract will understate the eventual conversion of the most recent inquiries, which have not had time to resolve. Duration metrics are reliable for closed cases; conversion and outcome metrics are reliable for cohorts old enough to have run their course. As with the complaint and recall data it joins to, this dataset is authoritative for established patterns and multi-year trends, not as a real-time monitor of what was opened last month.
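The effect of resolution lag can be sketched by computing conversion per opening-year cohort and reporting how much of each cohort is still open. The rows below are made up, and the open_year and closed fields are assumed to have been derived from the open and close dates beforehand.

```python
from collections import defaultdict

# Made-up rows; open_year and closed are assumed to be derived from the dates.
rows = [
    {"action": "PE18-901", "open_year": 2018, "closed": True,  "campno": "18V900000"},
    {"action": "PE18-902", "open_year": 2018, "closed": True,  "campno": ""},
    {"action": "EA19-903", "open_year": 2019, "closed": True,  "campno": "19V900001"},
    {"action": "PE19-904", "open_year": 2019, "closed": True,  "campno": "19V900002"},
    {"action": "PE24-905", "open_year": 2024, "closed": False, "campno": ""},
    {"action": "PE24-906", "open_year": 2024, "closed": False, "campno": ""},
]


def cohort_table(rows):
    """Per opening year: cohort size, share still open, conversion among closed."""
    cohorts = defaultdict(lambda: {"n": 0, "open": 0, "closed_recalled": 0})
    for r in rows:
        c = cohorts[r["open_year"]]
        c["n"] += 1
        if not r["closed"]:
            c["open"] += 1
        elif r["campno"].strip():
            c["closed_recalled"] += 1
    table = {}
    for year, c in sorted(cohorts.items()):
        closed = c["n"] - c["open"]
        table[year] = {
            "n": c["n"],
            "open_share": c["open"] / c["n"],
            # Undefined until at least one investigation in the cohort closes.
            "conversion_closed": c["closed_recalled"] / closed if closed else None,
        }
    return table


table = cohort_table(rows)
naive = sum(1 for r in rows if r["campno"].strip()) / len(rows)
print(f"Naive conversion over all rows: {naive:.0%}")
for year, stats in table.items():
    print(year, stats)
```

In this toy data the naive rate is pulled down by a recent cohort that has not resolved, while the per-cohort view shows the older cohorts' conversion and flags the recent one as not yet measurable.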

Held with those caveats, the nhtsa_investigations table is a uniquely valuable resource: roughly 5,300 dated, typed, outcome-linked records of how the federal government decides, case by case, that something on the road is dangerous enough to call back—the deciding middle of a safety system whose signals arrive in the complaint record and whose remedies are recorded in the recall record, with the investigations standing between them as the account of how a defect actually becomes a recall.

Related writing

NHTSA Vehicle Recall Data: 70 Years of Safety Defects Across 900 Million Vehicles — The downstream half of the same pipeline: where the investigation record ends in a recall, the campaign number joins straight to the recall dataset's affected-population counts, remedies, and completion rates, closing the loop from defect inquiry to fixed vehicle.

NHTSA Vehicle Safety Complaints: The Federal Database Behind Auto Defect Investigations and Recalls — The upstream signal: the consumer complaints whose accumulating patterns most often supply the trigger that opens a Preliminary Evaluation, and the dataset that lets you measure the lead time from a complaint pattern to an investigation.

Every US traffic death since 1975: using NHTSA FARS to analyze road safety, vehicle defects, and enforcement gaps — The outcome the whole defect program exists to prevent: the fatality record that, joined to investigations and recalls, lets an analyst ask whether the crashes a defect caused show up in the national count of road deaths.