The Aviation Safety Pipeline: From NTSB Accident to FAA Airworthiness Directive

An aircraft comes down. Investigators pull the wreckage apart, trace the failure to a cracked spar, a corroded fitting, a fuel line that chafed against a bracket nobody thought to inspect, and write down what went wrong. Then—sometimes—a different agency reads that finding and makes the fix mandatory for every other aircraft of the same type still flying. That arc, from an unsafe condition discovered in a single accident to a legally binding repair imposed across a fleet, is the central feedback loop of American aviation safety. But the two halves of it live in different agencies' databases, and connecting them is how you measure whether investigations actually drive corrective action.

This article covers the division of labor that defines the pipeline—the National Transportation Safety Board investigates and the Federal Aviation Administration regulates, and why that split puts the finding and the fix in separate datasets; what each of the four federal sources contains and how it is keyed; the airworthiness directive as the legal instrument that converts an unsafe condition into a mandatory action; the make-and-model and type-certificate join keys that stitch the sources together, and why aligning them is the real engineering work; how a safety recommendation becomes a directive, and the lag between accident and rule; how the registered fleet's exposure to open directives is measured; the analytical questions the assembled chain answers; a Python workflow that pulls accidents for a model, ties them to the registered fleet, and points at the directives written against it; and the caveats—string matching across agencies, reporting lag, and the correlation-versus-causation trap—that every analyst must internalize before drawing conclusions.

Two agencies, one feedback loop

The single most important fact about the aviation-safety pipeline—the fact that explains why it is fragmented across databases in the first place—is the deliberate separation of investigation from regulation. The National Transportation Safety Board (NTSB) is an independent federal agency whose job is to investigate transportation accidents and determine their probable cause. It was made independent on purpose: an investigator that answered to the regulator could be tempted to find causes that did not implicate the regulator's own certification decisions. The NTSB therefore investigates, finds facts, determines probable cause, and issues safety recommendations—but it has no regulatory power whatsoever. It cannot ground an aircraft, cannot fine an operator, cannot order a repair. Its only instrument is persuasion: the published finding and the recommendation that flows from it.

The Federal Aviation Administration (FAA) is the opposite: it is the regulator. It certifies aircraft types, issues the type certificates and production certificates that let an aircraft be built and sold, certifies pilots and mechanics, runs the air traffic system, and—crucially for this pipeline—issues the legally binding rules that keep aircraft airworthy. When an unsafe condition is identified in a type design, it is the FAA, not the NTSB, that can make a fix mandatory, through the instrument called an airworthiness directive. The consequence of this division is structural: the discovery of an unsafe condition (the accident, the probable cause, the recommendation) lives in NTSB data, and the mandatory corrective action that may follow it (the airworthiness directive) lives in FAA data. Neither agency's dataset contains the whole loop. Reassembling the loop—connecting a Board finding to the directive the FAA later issued—is the entire point of treating these as one pipeline rather than four separate catalogs, because only the assembled chain can tell you whether an investigation actually produced a fix, and how long it took.

The four datasets

The pipeline draws on four federal aviation datasets, each parsed and keyed by make and model so that the analyst's job is aligning strings and dates across agencies rather than re-parsing each raw source. The first is the NTSB aviation accident record (stored as ntsb_aviation), the federal record of US civil-aviation accident and incident investigations: one row per investigated event, naming the aircraft's make, model, and registration (N-number), the location and date, the injury severity, the phase of flight, and—once the investigation closes—the probable cause. The second is the FAA airworthiness-directive and rulemaking record (stored as faa_airworthiness_directives), the catalog of mandatory safety actions and the Federal Register rulemaking that promulgates them: one row per directive, naming the make-and-model applicability, the unsafe condition, the required action, the compliance deadline, and the effective date.

The third is the FAA civil-aviation registry (stored as faa_registry), the master register of US-registered civil aircraft: one row per N-number, mapping each tail number to its make, model, serial number, year of manufacture, and registered owner. The fourth is the FAA safety databases(stored as faa_safety), the broader family of FAA safety and enforcement records —service difficulty reports, accident and incident data the FAA itself maintains, and related airworthiness and enforcement information—that supplements the Board's investigation record with the regulator's own operational view. Together these are four FAA and NTSB aviation datasets, and the value of holding them as one collection is that every one of them carries, somewhere, the make-and-model of the aircraft involved—which is exactly the column that lets them be joined. The fields that carry the connective information look, schematically, like this:

-- ntsb_aviation (one row per investigated accident/incident)
ntsb_number          -- the NTSB case identifier for the investigation
event_date           -- date of the accident or incident
make                 -- aircraft manufacturer (e.g. CESSNA, PIPER, BOEING)
model                -- aircraft model designation (e.g. 172, PA-28, 737)
registration_number  -- the N-number (tail number) of the aircraft
probable_cause       -- the Board's determined cause (after the report)

-- faa_airworthiness_directives (one row per AD / rulemaking action)
ad_number            -- the airworthiness directive identifier
make_model_applic    -- the make-and-model (+ serial range) the AD applies to
unsafe_condition     -- the condition the AD is issued to correct
required_action       -- the mandatory inspection/repair/replacement
effective_date       -- the date compliance becomes mandatory

-- faa_registry (one row per registered N-number)
n_number             -- the tail number, without the leading 'N'
mfr_mdl_code         -- code linking to the make-and-model reference table
make / model         -- manufacturer and model (via the reference join)
serial_number        -- airframe serial -- ties an AD's serial range to a tail

The grain differs by table and that difference is the source of most of the analytical care the pipeline demands. An NTSB row is an event: one accident, one aircraft (occasionally more), one investigation. An airworthiness-directive row is a rule: one mandatory action written against a make-and-model and frequently a serial-number range, applying to however many airframes fall inside that range. A registry row is an airframe: one physical aircraft, identified by its N-number and serial number. The pipeline's power comes from the fact that all three speak the same language at one point—the make and model—even though they describe events, rules, and objects respectively.

The airworthiness directive as the mandatory fix

An airworthiness directive (AD) is the legal instrument at the downstream end of the pipeline, and understanding what it is and is not is essential to reading the data. An AD is a regulation—issued by the FAA under its authority in Title 14 of the Code of Federal Regulations, Part 39—that applies to a type of aircraft, engine, propeller, or appliance when the FAA finds that an unsafe condition exists in that type design and is likely to exist or develop in other products of the same type design. Because it is a regulation rather than advice, compliance is mandatory: no person may operate an aircraft to which an AD applies except in accordance with that AD. An operator who ignores an applicable directive is flying an unairworthy aircraft, with all the legal and insurance consequences that follow.

Three features of the AD make it the right object to anchor the pipeline. First, it is written against a type, not an individual airframe—its applicability statement names a make and model (a Cessna 172, a Boeing 737, a particular engine model) and often a serial-number range or a list of affected serial numbers. That is precisely why make-and-model is the join key: the AD does not know about your specific tail number until you map its serial range onto the registry. Second, it carries an effective date and a compliance time—sometimes a fixed number of flight hours or calendar days, sometimes “before further flight” for the most urgent conditions—which is the datum that lets the pipeline measure how fast a fix was mandated. Third, the AD records the unsafe condition it corrects, frequently in language that traces directly back to a specific accident, incident, or field report. The most urgent ADs are issued as emergency airworthiness directives, effective immediately, for conditions that cannot wait for the normal notice-and-comment rulemaking cycle; the routine ones go through the Federal Register as a notice of proposed rulemaking before becoming final. Both forms land in the same dataset, distinguished by their effective dates and the urgency of the compliance language.

The join keys: make, model, and the type certificate

The connective tissue across all three primary datasets is the aircraft make and model, and the deeper conceptual link beneath it is the type certificate. When a manufacturer designs a new aircraft, the FAA issues a type certificate certifying that the design meets the applicable airworthiness standards; every airframe built to that approved design shares the type certificate, and an AD is, in effect, a finding that something in that type-certificated design is unsafe. So the make-and-model is not an arbitrary label—it is a proxy for the shared type design that makes a fix discovered on one airframe relevant to all the others. The chain runs like this: an NTSB accident record names the make, model, and N-number of the accident aircraft; the FAA registry maps each N-number to its make, model, and serial number; and an airworthiness directive is written against a make-and-model (and serial-number range). Make-and-model is therefore the column that appears in all three, and it is what lets an accident, a rule, and a fleet be talked about together.

The N-number deserves a word, because it is the registry's primary key and the bridge between the event data and the airframe data. Every US-registered civil aircraft carries a unique registration number—an “N-number,” so called because US registrations begin with the letter N. The N-number appears on an NTSB accident record (it identifies the specific aircraft that crashed) and is the key of the registry (which maps it to a make, model, and serial number). So a single accident can be resolved all the way down to the exact airframe and then back up to the type, while an AD's serial-number range can be projected forward onto the registry to enumerate the specific N-numbers it covers. The N-number gives you the individual airframe; the make-and-model gives you the population of airframes that share its fate.

The hard part—and the reason this is a data-engineering article and not merely a policy one—is that the make-and-model string is spelled differently in every system. The NTSB might record a manufacturer as “CESSNA” while the registry's reference table calls it “CESSNA AIRCRAFT CO” and an AD's applicability text says “Cessna Aircraft Company.” A model is “172N” in one place, “172 N” in another, and “172” with the variant carried in a separate field in a third. The registry does not even store the make-and-model directly on the master record; it stores a manufacturer/model code (MFR MDL CODE) that must be joined to a reference table (ACFTREF) to recover the human-readable make and model. Aligning these strings—normalizing case, collapsing spacing and punctuation, reconciling manufacturer-name variants, and matching a directive's serial-number range against the registry's serial field—is the bulk of the work. Get it right and the four datasets snap into one accountable chain; get it wrong and the join silently drops the very accidents and directives the analysis is meant to connect.

From safety recommendation to directive: the lag

The most analytically interesting question the assembled pipeline can answer is also the one that most clearly requires both agencies' data: how long does it take an unsafe condition to travel from accident finding to mandatory directive, and how often does it make the journey at all? The path is institutional. The NTSB, having found a probable cause, issues a safety recommendation—often addressed to the FAA—urging a specific action. Because the NTSB cannot regulate, the recommendation is not self-executing: the FAA must decide whether to act, and if it acts, the instrument is typically an airworthiness directive (or, for broader design issues, a change to the certification standards). The recommendation is the request; the AD is the response; and the interval between them is a measurable quantity once the two datasets are joined.

Measuring that interval is genuinely illuminating because the relationship is neither automatic nor uniform. Some unsafe conditions move fast: a fatal accident exposes a flight-control or structural failure, the FAA issues an emergency AD within days or weeks, and the lag is short. Others move slowly or not at all: the NTSB issues a recommendation, the FAA studies it, weighs the cost of compliance across the affected fleet against the assessed risk, and either issues a directive years later, addresses the condition through some other means, or declines to act—closing the recommendation as “unacceptable” in the Board's tracking. By joining the accident and recommendation records to the directive record on the affected make-and-model and ordering them by date, an analyst can compute the accident-to-directive lag for the conditions that did produce a directive, and—harder but more revealing—estimate how often a Board finding produced no directive at all. That second number is the empirical test of how well the persuasion-only design of the NTSB actually drives corrective action through the regulator that holds the power.

Fleet exposure: projecting directives onto the registry

The third leg of the pipeline turns the directive back onto the living fleet. An AD is a rule against a type; the registry is the census of the airframes of that type still flying. Joining them answers the exposure question: how many registered aircraft does a given directive apply to, and how many open directives does a given aircraft carry? Because the registry maps every N-number to a make, model, and serial number, and an AD is written against a make-and-model and serial range, the directive's applicability can be projected directly onto the registered fleet—turning an abstract rule into a concrete list of affected tail numbers.

This projection is what makes the pipeline operationally useful rather than merely historical. From the fleet side, it reveals which aircraft types and components draw the most directives—the airframes and engine models that have accumulated the longest list of mandatory inspections and repairs over their service lives, which is a rough index of where the type design has proven troublesome. From the directive side, it reveals the magnitude of each rule: an AD against a common training aircraft with tens of thousands of registrations imposes a vastly larger aggregate compliance burden than one against a rare type with a handful of airframes, even if the per-aircraft fix is identical. Aggregated, the projection produces a fleet-wide picture of open-directive exposure—which models carry the heaviest airworthiness obligations, how that burden concentrates by manufacturer and era, and where an aging fleet is accumulating directives faster than it is retiring airframes. None of this is visible from the AD catalog or the registry alone; it requires the join.

What the assembled pipeline answers

Assembled, the chain answers a set of questions that no single database can. The first is the latency question already described—the time from an accident finding to the mandatory directive that corrects it—which is the empirical measure of how responsive the regulatory side of the loop is to the investigative side. The second is the efficacy question: how often does an NTSB safety recommendation actually become an FAA airworthiness directive, as opposed to being addressed some other way or declined? That ratio is the clearest available proxy for the practical force of the Board's findings.

The third is the concentration question: which aircraft types and which components—airframe structures, engines, fuel systems, landing gear, flight controls—draw the most directives, and how does that concentration map onto accident history? A type that appears frequently in both the accident record and the directive record, with the directives tracing the accidents, is the pipeline working as designed; a type heavy in accidents but light in directives is a prompt for closer scrutiny. The fourth is the exposure question: how is the registered fleet exposed to open directives, and where does the compliance burden fall hardest? And underlying all of them is the simplest framing of the whole exercise: turning a fragmented set of databases—an investigator's findings here, a regulator's rules there, a census of airframes somewhere else—into one accountable chain that follows a single unsafe condition from the accident that surfaced it, through the rule that fixed it, to the aircraft the rule protects.

Python workflow: accidents, fleet, and directives for one model

The script below pulls NTSB accident records for a single aircraft model from the Board's public CAROL query service, loads the FAA releasable aircraft registry from its flat-file extracts (the MASTER file of N-numbers joined to the ACFTREF make-and-model reference table), normalizes the make-and-model string into a comparable key, and ties the accident history to the registered fleet—computing the model's accident count, its count of still-registered airframes, and a coarse accidents-per-airframe figure. The airworthiness directives that complete the chain are matched in a second step against the FAA Dynamic Regulatory System, by querying for ADs whose applicability text names the same make-and-model and then differencing the directive's effective date against the precipitating accident date. All three sources are public and key-free. Requirements: requests and pandas; the registry flat files are downloaded from the FAA's releasable-registry distribution.

import requests, re
import pandas as pd
from collections import Counter

# Three public, key-free aviation sources, joined on make-and-model:
#   1. NTSB aviation investigation data (CAROL query / avall extract)
#   2. FAA airworthiness directives + Federal Register rulemaking
#   3. FAA releasable civil-aviation aircraft registry (master + acftref)
#
# The pipeline answers a question no single source can: for one aircraft
# model, how do accidents, mandatory directives, and the registered fleet
# line up? The connective tissue is the make-and-model string, which is
# spelled differently in every system -- so the real work is normalizing it.

NTSB = "https://data.ntsb.gov/carol-main-public/api/Query/Main"
FAA_AD = "https://drs.faa.gov/api"   # FAA Dynamic Regulatory System (DRS) search


def norm_model(make, model):
    # Collapse the make-and-model into a comparable key. Registries and
    # ADs disagree on spacing, hyphens, and the "172N" vs "172 N" forms.
    s = f"{make} {model}".upper()
    s = re.sub(r"[^A-Z0-9 ]", " ", s)
    return " ".join(s.split())


def ntsb_accidents_for_model(make, model, limit=2000):
    # CAROL accepts a JSON query; filter to aviation events for one model.
    body = {
        "ResultSetSize": limit,
        "QueryGroups": [{
            "QueryRules": [
                {"FieldName": "Vehicles.Make", "Condition": "Contains", "Values": [make]},
                {"FieldName": "Vehicles.Model", "Condition": "Contains", "Values": [model]},
            ],
            "AndOr": "and", "RuleType": "Simple",
        }],
    }
    r = requests.post(NTSB, json=body, timeout=120)
    r.raise_for_status()
    rows = r.json().get("Results", [])
    return pd.DataFrame(rows)


def registered_fleet(acftref_csv="ACFTREF.txt", master_csv="MASTER.txt"):
    # The releasable registry ships as flat files: MASTER (one row per
    # N-number) joined to ACFTREF (one row per make-and-model type code).
    ref = pd.read_csv(acftref_csv, dtype=str).rename(columns=str.strip)
    mas = pd.read_csv(master_csv, dtype=str).rename(columns=str.strip)
    fleet = mas.merge(ref, left_on="MFR MDL CODE", right_on="CODE", how="left")
    fleet["_key"] = [norm_model(m, d) for m, d in zip(fleet["MFR"], fleet["MODEL"])]
    return fleet


def analyze(make, model):
    acc = ntsb_accidents_for_model(make, model)
    fleet = registered_fleet()
    key = norm_model(make, model)

    # --- 1. Accident history for the model -------------------------------
    n_acc = len(acc)
    print(f"{make} {model}: {n_acc:,} NTSB accident/incident records")

    # --- 2. Fleet exposure: how many of this model are still registered --
    on_reg = fleet[fleet["_key"] == key]
    print(f"  Active registrations matching the model key: {len(on_reg):,}")

    # --- 3. Accident rate per registered airframe (coarse) ---------------
    if len(on_reg):
        rate = n_acc / len(on_reg)
        print(f"  Accidents per registered airframe (lifetime, coarse): {rate:.3f}")

    # NOTE: airworthiness directives are matched separately by querying the
    # FAA DRS for ADs whose applicability text names this make-and-model;
    # the AD’s effective date minus the precipitating accident date is the
    # accident-to-directive lag this pipeline exists to measure.
    return acc, on_reg


analyze("CESSNA", "172")

Two things about the script deserve emphasis. First, the norm_model helper is not incidental polish—it is the load-bearing component, because the make-and-model string is the only thing the four sources share and they all spell it differently. The version shown collapses case, punctuation, and spacing, which handles the easy mismatches; a production pipeline needs a manufacturer-name crosswalk (mapping “CESSNA,” “CESSNA AIRCRAFT CO,” and “Cessna Aircraft Company” to one canonical maker) and serial-range parsing so that a directive's applicability can be tested airframe by airframe rather than only at the model level. Second, the accidents-per-airframe figure is deliberately coarse—it divides a lifetime accident count by a current registration count, which mixes time frames and ignores flight hours—so it is a sorting heuristic, not a true accident rate. The honest rate metrics require exposure data (flight hours or departures) that the federal datasets do not carry, which is the first of the caveats the next section lays out.

Limitations and analytical caveats

The aviation-safety pipeline is uniquely powerful precisely because it spans agencies, but that span is also the source of its hazards. Several features must be held firmly in mind before drawing conclusions from the assembled chain.

String matching across agencies is the dominant source of error.The whole pipeline rests on the make-and-model join, and the make-and-model is free-ish text that each agency records in its own conventions. Manufacturer names vary, model designations carry or drop their variant suffixes, and the registry hides the human-readable make-and-model behind a code that must itself be joined to a reference table. Every imperfection in the normalization either drops a true match (silently understating how many accidents or directives a model has) or creates a false one (lumping unrelated types together). Because the errors are silent—a missed join produces no warning, just a smaller number—the quality of the entire analysis is bounded by the quality of the entity resolution applied to the make-and-model strings, and any serious result must be spot-checked back to the underlying records by N-number and serial number.

The accident record lags, and an open investigation has no cause yet.An NTSB investigation can take a year or more to close, and the probable cause—the field that actually states what went wrong—is not populated until it does. Any analysis that depends on the probable cause therefore systematically excludes the most recent accidents, whose investigations are still open, which means the leading edge of the data understates recent activity and cannot yet show the causes that may drive the next directives. The pipeline is authoritative for established, closed-out patterns; it is not a real-time monitor of what crashed last month or what the cause will turn out to be.

A directive does not always trace to an accident, and an accident does not always produce a directive. The clean narrative—accident, finding, recommendation, directive—is one path among several. Many ADs are issued from service-difficulty reports, manufacturer findings, fleet inspections, or in-service experience that never involved an accident at all; the unsafe condition was caught before anyone was hurt, which is the system working at its best. Conversely, many accidents produce no directive, because the cause was pilot error, weather, or a one-off maintenance lapse rather than a flaw in the type design that an AD could fix. Treating the accident-to-directive link as universal in either direction misreads the data: the join finds the cases where the loop closed, but the absence of a link is not evidence that the system failed—it is usually evidence that no design-level fix was warranted.

Counts are not rates, and the registry is exposure without denominators. Ranking models by accident count, or by directive count, mostly measures how many of that model exist and how long they have flown—a ubiquitous training aircraft will top every count simply because there are tens of thousands of them logging millions of hours. A genuine safety comparison needs a denominator: flight hours, departures, or aircraft-years of exposure. The federal datasets in this pipeline do not carry that exposure data, so the registry can tell you how many airframes exist but not how much they fly. Any claim that one type is “more dangerous” than another on the strength of raw counts is confusing volume with risk, and the coarse accidents-per-airframe figure in the script is a sorting aid, not a defensible rate.

Held with those caveats, the assembled aviation-safety pipeline is one of the most revealing cross-agency joins in federal data: it follows a single unsafe condition from the NTSB accident that surfaced it, through the FAA airworthiness directive that made the fix mandatory, down to the registered aircraft the directive protects—turning an investigator's finding, a regulator's rule, and a census of airframes into one accountable chain that no single database could provide, and that measures, in the gaps and lags between its links, how well American aviation actually learns from the accidents it cannot prevent.

Related writing

NTSB Aviation Accident Database: The Federal Record Behind Every US Aircraft Accident Investigation — The investigative half of the pipeline in full: the field-level guide to the accident record that supplies the make, model, N-number, and probable cause this chain begins from.

FAA Airworthiness Directives: The Federal Record of Mandatory Aircraft Safety Fixes — The regulatory half: the catalog of mandatory safety actions under 14 CFR Part 39 whose make-and-model applicability and effective dates make the accident-to-directive lag measurable.

FAA Civil Aviation Registry: The Federal Database Behind 700,000 Pilots and 300,000 Aircraft — The census of airframes that turns a directive against a type into a list of affected tail numbers, mapping each N-number to the make, model, and serial number the pipeline projects directives onto.