FRA Rail Accidents: The Federal Record of Every Reportable US Railroad Incident

When a train leaves the rail, when two consists meet on the same track, when a tank car ruptures at a grade crossing, a federal report follows. The railroad fills out a standardized form, names the cause, counts the dead and the injured, totals the dollar damage, and sends it to the Federal Railroad Administration, which has compiled every such report since 1975 into a single national ledger: roughly 224,000 reportable accidents and incidents, one row each, the most complete public record of what goes wrong on America's railroads and why.

This article covers what the rail-accidents dataset is and the FRA's safety mandate within the Department of Transportation; the reporting forms—the Form 6180.54 rail-equipment accident report at the system's core—and the monetary damage threshold that decides which events must be reported at all; the cause-code taxonomy that classifies each accident as a human-factor, track, equipment, signal, or miscellaneous failure, and why that taxonomy underlies every regulatory debate in the industry; the accident types, from derailment to collision to highway-rail grade-crossing strike, and the train consist each record carries; the trends the data documents—derailments and hazardous-material releases after the 2023 East Palestine, Ohio derailment, and the effect of Positive Train Control on collisions; how the accidents table joins to the FRA highway-rail grade-crossing inventory through the crossing identifier; a Python workflow that pulls the FRA Office of Safety Analysis download files, tallies events by cause code and railroad, and trends derailments by year; and the caveats—self-reporting, the moving threshold, and cause-code subjectivity—that every analyst must internalize first.

What the dataset is

The Railroad Accident/Incident Reporting System is the FRA's national database of reportable railroad accidents and incidents. Railroads operating in the United States are required by federal regulation to report qualifying events to the FRA, which compiles them into a structured record that reaches back to 1975. The system is not a sample or a survey—it is a mandatory census of every event that crosses the reporting threshold, and it is the foundation on which the agency, Congress, the National Transportation Safety Board, researchers, and the railroads themselves measure rail safety. At the center of the system is the Form 6180.54, universally called the Form 54: the rail-equipment accident/incident report that a railroad files for a collision, derailment, or other event involving on-track equipment.

In our database this record is stored as the table fra_rail_accidents, with a grain of one row per reported accident or incident: a single derailment is one row, a single grade-crossing collision is one row. The record comprises roughly 224,000 events. Each row identifies who reported it, where and when it happened, what equipment was involved, what cause the reporting railroad assigned to it, and what it cost in lives, injuries, and dollars. The columns capture that anatomy:

railroad_code           -- reporting railroad (FRA alphabetic code, e.g. BNSF, UP)
incident_year           -- year the accident/incident was reported
incident_date           -- date of the event
state_county_city       -- location: state, county, nearest city / station
accident_type           -- derailment, head-on/rear-end collision, etc. (Form 54)
cause_code              -- primary cause code (human/track/equipment/signal/misc.)
contributing_cause      -- secondary contributing cause code, where assigned
train_consist           -- locomotives, loaded cars, empty cars, tons
speed_mph               -- train speed at the time of the accident
total_killed            -- fatalities attributed to the event
total_injured           -- reported injuries
total_damage            -- equipment + track damage in dollars (threshold field)
hazmat_cars_released    -- count of hazmat cars that released product
grade_crossing_id       -- DOT-AAR crossing ID, where a crossing is involved

Several of these columns carry more weight than the rest. The cause_code is the analytic heart of the file: it is the reporting railroad's coded judgment, expressed in the FRA's taxonomy, of why the accident happened, and it is what makes root-cause analysis across the network possible. The total_damage field is the gatekeeper: because the obligation to file a Form 54 turns on a monetary threshold, the damage figure is both a severity measure and the variable that decides whether the event is in the dataset at all. The train_consist fields (the count of locomotives, loaded and empty cars, and the trailing tonnage) describe what was moving, which matters enormously for derailment dynamics and for the hazardous-material exposure that a long train of loaded tank cars represents. And the grade_crossing_id, present when a highway-rail crossing is involved, is the key that joins an accident to the crossing's own inventory record, tying the event to the physical infrastructure where it occurred.

The FRA and its safety mandate

The Federal Railroad Administration is the agency within the United States Department of Transportation charged with regulating railroad safety. It was created in 1966 as part of the legislation that established the Department of Transportation, and its safety authority was decisively strengthened by the Federal Railroad Safety Act of 1970, which gave the agency broad power to prescribe rules, regulations, and standards for every area of railroad safety and to enforce them across the national rail network. The FRA writes the federal railroad safety regulations—codified in Title 49 of the Code of Federal Regulations, where successive parts govern track safety standards, locomotive and freight-car equipment, signal and train-control systems, operating practices, hours of service, and hazardous-material transportation by rail—and it employs the safety inspectors who enforce them.

The accident-reporting requirement flows directly from that mandate. Under 49 CFR Part 225, railroads must report accidents, incidents, deaths, injuries, and occupational illnesses to the FRA, on prescribed forms and within prescribed timeframes, and must maintain the underlying records for inspection. The purpose, stated in the rule itself, is to provide the FRA with accurate information on the nature, causes, and circumstances of railroad accidents so that the agency can identify hazards, set priorities, and measure whether its regulations are working. This is the statutory and regulatory engine that fills the dataset: the rows exist because Part 225 compels the railroads to create them, and the consistency of the file—the same forms, the same fields, the same cause codes, decade after decade—is a direct product of that single, mandatory reporting standard. It is also why the FRA, not the railroads, is the authoritative custodian of the national record, and why the data is a federal-government product rather than an industry compilation.

The reporting forms and the damage threshold

Part 225 prescribes a small family of report forms, each capturing a different category of event, and understanding the family is the key to understanding the data's scope. The Form 6180.54—the rail-equipment accident/incident report, the Form 54—is the one this dataset centers on. It is filed for events involving on-track rail equipment: derailments, collisions, and other accidents in which railroad cars or locomotives are damaged. Alongside it, the Form 6180.55 and its supplements capture the railroad injury and illness summary—casualties to employees, passengers, and others—and a separate highway-rail grade-crossing report (the Form 6180.57) records the details of collisions between trains and highway users at crossings. A single serious event can generate more than one form: a grade-crossing collision that derails cars and injures people produces a crossing report, a rail-equipment report, and casualty entries that the FRA links together.

The single most consequential design feature of the whole system is the monetary damage threshold for the Form 54. A rail-equipment accident is reportable only if the damage to railroad on-track equipment, signals, track, track structures, and roadbed exceeds a dollar figure that the FRA sets and periodically adjusts. The threshold exists for a practical reason: without it, the agency would be flooded with reports of trivial coupler scrapes and minor yard bumps, drowning the signal in noise. With it, the dataset is restricted to events of at least a defined economic severity. But the threshold has a profound analytic consequence that this article returns to in the caveats: because it is a dollar amount and dollars inflate, the FRA raises it over time, and a derailment whose damage would not have cleared the bar in one year might clear it in another—or vice versa. The reporting threshold is therefore not a fixed line in the world but a moving administrative boundary, and any long-run trend in event counts is partly a story about where that boundary has been set, not only about how often trains actually derail.

The cause-code taxonomy

What elevates the rail-accidents file above a mere casualty ledger is its cause-code system. For every reportable rail-equipment accident, the reporting railroad assigns a primary cause code—and frequently a contributing cause code—drawn from a detailed FRA taxonomy that classifies the mechanism of failure. The codes group into five broad families, and the leading character of the code signals which family it belongs to.

Human factors covers accidents caused by the actions of people operating the railroad: a crew exceeding authorized speed, failing to obey a signal, improperly lining a switch, mishandling brakes, or making a yard-switching error. Human-factor causes are consistently one of the two largest families and are the category that crew-fatigue rules, operating-practice regulations, and train-control technology are designed to attack. Track causes attribute the accident to the condition of the roadway (a broken rail, a track-geometry defect, a failed joint, a wide gauge, a track buckled by summer heat), and they are the family that the FRA track safety standards and the railroads' inspection and maintenance programs target. Equipment causes point to the rolling stock and locomotives: a failed wheel, a broken axle, a defective bearing (the overheated wheel bearing implicated in major recent derailments falls here), a brake-system fault. Signal and communication causes implicate the signal and train-control systems themselves. And miscellaneous gathers everything else: weather, obstructions, vandalism, and causes that do not fit the other four families.

This taxonomy is what makes the dataset a tool for policy rather than just a chronicle. Because every accident carries a coded cause, an analyst can ask the questions that drive regulation: what share of derailments trace to track defects versus equipment failures? Are human-factor accidents falling as train-control technology spreads? Which cause families produce the most casualties per event, or the most hazmat releases? The cause code is the bridge from individual misfortune to systemic pattern, and it is the column on which the entire root-cause-analysis apparatus of American rail safety rests. The contributing-cause field adds a second dimension—many accidents have a chain of causation, and the secondary code captures the factor that combined with the primary one to produce the event.
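The first of those questions, the share of derailments traced to each cause family, can be sketched in a few lines. This is a minimal illustration on synthetic rows, not the FRA's method: the field names follow the column list earlier in this article, the type code "01" for a derailment is the one the workflow script later in this article assumes, and the cause-code strings are placeholders in which only the leading letter matters.

```python
from collections import Counter

# Leading letter of the cause code -> cause family.
FAMILY = {"H": "human factors", "T": "track", "E": "equipment",
          "S": "signal", "M": "miscellaneous"}

def cause_family(code):
    """Map a cause code to its family; blanks and unknowns fall to 'other/blank'."""
    head = str(code or "").strip()[:1].upper()
    return FAMILY.get(head, "other/blank")

def family_shares(records, accident_type):
    """Share of each cause family among the records of one accident type."""
    counts = Counter(cause_family(r["cause_code"])
                     for r in records if r["accident_type"] == accident_type)
    total = sum(counts.values())
    return {fam: n / total for fam, n in counts.items()} if total else {}

# Synthetic rows. The code strings are placeholders, not real FRA codes.
sample = [
    {"accident_type": "01", "cause_code": "T000"},
    {"accident_type": "01", "cause_code": "T000"},
    {"accident_type": "01", "cause_code": "E000"},
    {"accident_type": "01", "cause_code": "H000"},
    {"accident_type": "02", "cause_code": "H000"},
]
shares = family_shares(sample, "01")
for fam, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{fam:<16} {share:.0%}")
```

Swapping the filter for a casualty or hazmat condition gives the other questions in the same shape; a rigorous version would group on the full code rather than its first letter.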

Accident types and the train consist

Orthogonal to the cause is the accident type: what kind of event it physically was. The Form 54 taxonomy distinguishes derailments, the most numerous category, in which cars or locomotives leave the rail; collisions, which the form further subdivides by how the equipment met (head-on, rear-end, side, raking, and switching collisions); highway-rail grade-crossing accidents, in which a train strikes a highway user at a crossing; and a set of other types covering obstructions, fires and explosions, and miscellaneous events. The type and the cause are independent axes: a derailment can be caused by a track defect, an equipment failure, or a human error. Analyzing them together is what yields the texture of rail risk: track-caused derailments behave differently from equipment-caused ones, and grade-crossing accidents have an entirely different causal profile because the proximate actor is usually the motorist, not the railroad.

Each rail-equipment record also carries the train consist: the makeup of the train involved. The consist fields record the number of locomotives, the number of loaded cars, the number of empty cars, and the trailing tonnage, along with the train's speed at the moment of the accident. This is not bookkeeping detail—it is central to severity analysis. A long, heavy train of loaded cars carries more kinetic energy and, when it derails, tends to pile up more cars than a short one; the position of a tank car in the consist and the number of loaded hazardous-material cars determine the scale of a potential release; and speed is one of the strongest predictors of how destructive a derailment becomes. The consist is also what lets analysts connect accident outcomes to the secular growth in train length—the industry's move toward longer, heavier trains is a live safety debate, and the consist fields are the data that lets it be studied empirically rather than anecdotally.
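One simple way to look at the speed-severity relationship described above is to bucket events into speed bands and compare the median reported damage in each. The sketch below assumes the speed_mph and total_damage fields from the column list and runs on synthetic values; the 10 mph band width is an arbitrary choice, and rows with unparseable values are skipped rather than imputed.

```python
from collections import defaultdict
from statistics import median

def median_damage_by_speed(records, width=10):
    """Return {band lower bound in mph: median damage} for parseable rows."""
    bands = defaultdict(list)
    for r in records:
        try:
            mph = float(r["speed_mph"])
            damage = float(r["total_damage"])
        except (KeyError, TypeError, ValueError):
            continue                      # blank or malformed field: skip the row
        lower = int(mph // width) * width # e.g. 27 mph falls in the 20 band
        bands[lower].append(damage)
    return {lower: median(vals) for lower, vals in sorted(bands.items())}

# Synthetic rows, stored as strings the way a dtype=str load would deliver them.
sample = [
    {"speed_mph": "5",  "total_damage": "12000"},
    {"speed_mph": "8",  "total_damage": "20000"},
    {"speed_mph": "25", "total_damage": "90000"},
    {"speed_mph": "27", "total_damage": "150000"},
    {"speed_mph": "48", "total_damage": "800000"},
    {"speed_mph": "",   "total_damage": "30000"},   # no speed recorded
]
damage_by_band = median_damage_by_speed(sample)
for lower, med in damage_by_band.items():
    print(f"{lower:>3}-{lower + 9} mph  median damage ${med:,.0f}")
```

The median is used instead of the mean because a handful of catastrophic events would otherwise dominate every band. The same grouping works with tonnage or car counts in place of speed.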

Derailments, hazmat, PTC, and East Palestine

The rail-accidents dataset is the evidentiary base for nearly every major rail-safety debate of the past two decades, and three intertwined threads—derailments and hazardous-material releases, Positive Train Control, and the renewed scrutiny that followed East Palestine—show how the data and the policy feed each other.

Derailments and hazmat releases are the events that turn a routine accident into a public emergency. Most derailments are minor—a few cars off the rail in a yard, no injuries, modest damage—but a derailment involving loaded tank cars of hazardous material can force evacuations, contaminate soil and water, and dominate national news. On February 3, 2023, a Norfolk Southern train derailed in East Palestine, Ohio, releasing hazardous chemicals including vinyl chloride and prompting a large evacuation and a long environmental cleanup. The derailment, which the NTSB investigation traced to an overheated wheel bearing—an equipment-cause mechanism that the accident record captures—brought a wave of national attention to rail safety, renewed congressional interest in railroad regulation, and intensified scrutiny of wayside defect detectors, tank-car standards, and train length. The accident data is precisely what lets such an event be placed in context: how does the rate of hazmat-involved derailments trend, are equipment-caused derailments rising or falling, and how does one catastrophic event compare to the long baseline the dataset preserves.

Positive Train Control (PTC) is the clearest case of the data driving and then measuring a regulatory intervention. PTC is a set of technologies designed to automatically prevent the specific accident types that human error produces—train-to-train collisions, derailments caused by exceeding a speed restriction, incursions into established work zones, and movements through misaligned switches. After a series of high-profile collisions, the Rail Safety Improvement Act of 2008 mandated the implementation of PTC across a large share of the national network, and the system was substantially in operation by the end of 2020. The accident record is how the effect of PTC is judged: because the system targets a well-defined subset of human-factor causes, analysts can isolate exactly those cause codes in the data and ask whether the events PTC is meant to prevent have declined on PTC-equipped lines. The cause-code taxonomy makes that evaluation possible—without it, one could only measure total accidents, not the specific failures the technology addresses.
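The isolation step described above amounts to filtering on a set of specific cause codes and tracking that subset over time. The sketch below is illustrative only: the codes in PTC_TARGET_CODES are hypothetical placeholders, not real FRA cause codes, and would need to be replaced with the codes that the published cause-code dictionary ties to the failures PTC addresses. The records are synthetic.

```python
from collections import Counter

# Hypothetical placeholders, NOT real FRA cause codes. Substitute the
# specific human-factor codes for overspeed, signal violations,
# work-zone incursions, and misaligned switches from the FRA dictionary.
PTC_TARGET_CODES = {"H_OVERSPEED", "H_SIGNAL", "H_WORKZONE", "H_SWITCH"}

def targeted_share_by_year(records, target_codes):
    """Return {year: (targeted events, all events, targeted share)}."""
    targeted, total = Counter(), Counter()
    for r in records:
        year = int(r["incident_year"])
        total[year] += 1
        if str(r["cause_code"]).strip().upper() in target_codes:
            targeted[year] += 1
    return {y: (targeted[y], total[y], targeted[y] / total[y])
            for y in sorted(total)}

# Synthetic rows with placeholder codes.
sample = [
    {"incident_year": 2015, "cause_code": "H_SIGNAL"},
    {"incident_year": 2015, "cause_code": "H_OVERSPEED"},
    {"incident_year": 2015, "cause_code": "T_PLACEHOLDER"},
    {"incident_year": 2015, "cause_code": "E_PLACEHOLDER"},
    {"incident_year": 2022, "cause_code": "h_switch "},   # messy casing and padding
    {"incident_year": 2022, "cause_code": "T_PLACEHOLDER"},
    {"incident_year": 2022, "cause_code": "T_PLACEHOLDER"},
    {"incident_year": 2022, "cause_code": "M_PLACEHOLDER"},
]
ptc_trend = targeted_share_by_year(sample, PTC_TARGET_CODES)
for year, (hit, n, share) in ptc_trend.items():
    print(f"{year}: {hit} of {n} events in the targeted subset ({share:.0%})")
```

A share of all events is only a first look. The column list in this article has no field marking whether a line was PTC-equipped, so a real evaluation would need that attribute from another source, along with an exposure denominator.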

These threads connect to the broader regulatory agenda the dataset informs: debates over track inspection frequency and methods, which the track-cause family speaks to; over crew size and the two-person-crew rule, which the human-factor family informs; and over the wayside detectors and tank-car designs that the equipment-cause and hazmat fields illuminate. In each case the argument is ultimately about what the accident record shows, which is why the custody and integrity of that record are themselves a matter of public consequence.

Joining to the grade-crossing inventory

The rail-accidents table is most powerful when joined to the FRA's other safety datasets, and the most important of those joins is to the highway-rail grade-crossing inventory. A large and distinct share of rail casualties happens not on the open line but where the railroad crosses a road, and those events are recorded with a grade-crossing identifier—the standardized DOT-AAR crossing number assigned to every public and private highway-rail crossing in the country. That same identifier is the primary key of the crossing inventory, the FRA's master register of every crossing's physical and operational characteristics: its location, the roadway and the railroad that meet there, the number of tracks, the train and highway traffic counts, and—critically—the warning devices present, from a bare crossbuck sign to flashing lights to gates.

Joining the accidents to the inventory on the crossing identifier is what turns a grade-crossing collision from an isolated event into a question about infrastructure. It lets an analyst ask whether crossings with passive warning (signs only) suffer collisions at a higher rate than those with active warning (lights and gates), whether collision risk scales with train and highway traffic, and which specific crossings—by their accident history relative to their exposure—most warrant the investment of a gate or a grade separation. This is the empirical core of the federal grade-crossing safety program: the Section 130 program that funds crossing improvements relies on exactly this kind of accidents-to-inventory join to prioritize where limited safety dollars should go. Without the inventory, a grade-crossing accident is a casualty count; with it, the accident is anchored to a known piece of infrastructure with known warning devices and known exposure, which is the only way to reason about prevention.
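The join itself is a key lookup. The sketch below uses plain dictionaries and synthetic rows to compute collisions per crossing by warning class. The inventory field names crossing_id and warning, the two-class passive/active split, and the crossing numbers are all assumptions made for illustration; the real inventory layout should be taken from the FRA documentation.

```python
from collections import Counter

def collisions_per_crossing(accidents, inventory):
    """Join accidents to the inventory on crossing ID.

    Returns ({warning class: collisions per inventoried crossing},
             count of crossing events whose ID is missing from the inventory).
    """
    by_id = {row["crossing_id"]: row for row in inventory}
    crossings = Counter(row["warning"] for row in inventory)   # denominator
    hits, unmatched = Counter(), 0
    for a in accidents:
        cid = (a.get("grade_crossing_id") or "").strip()
        if not cid:
            continue                  # no crossing involved in this event
        match = by_id.get(cid)
        if match is None:
            unmatched += 1            # surface join failures, do not hide them
            continue
        hits[match["warning"]] += 1
    return {w: hits[w] / n for w, n in crossings.items()}, unmatched

# Synthetic inventory and accident rows; the IDs are made up.
inventory = [
    {"crossing_id": "000001A", "warning": "passive"},
    {"crossing_id": "000002B", "warning": "passive"},
    {"crossing_id": "000003C", "warning": "active"},
    {"crossing_id": "000004D", "warning": "active"},
]
accidents = [
    {"grade_crossing_id": "000001A"},
    {"grade_crossing_id": "000001A"},
    {"grade_crossing_id": "000002B"},
    {"grade_crossing_id": " 000003C"},   # padded ID, stripped before lookup
    {"grade_crossing_id": ""},           # e.g. a derailment on open track
    {"grade_crossing_id": None},
    {"grade_crossing_id": "999999Z"},    # not present in the inventory
]
rates, unmatched = collisions_per_crossing(accidents, inventory)
for warning, rate in sorted(rates.items()):
    print(f"{warning:<8} {rate:.2f} collisions per crossing")
print(f"unmatched crossing IDs: {unmatched}")
```

Counting unmatched IDs matters in practice, since closed or renumbered crossings can silently drop out of a join. Collisions per crossing is still a crude rate; dividing by train and highway traffic counts from the inventory would be the next refinement.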

Analytical uses

A national, event-resolved, cause-coded, decades-deep record of railroad accidents supports a distinctive set of analyses, most of which the cause code and the dollar-damage field make possible.

Derailment and hazmat trend analysis is the headline use. Because every event carries a date, a type, a consist, and a count of hazmat cars that released, an analyst can trend derailments over time, separate the catastrophic hazmat events from the routine ones, and test whether the rate of consequential derailments is rising or falling once exposure (train-miles, car-miles) is accounted for. The necessary discipline—developed in the caveats—is to normalize by exposure and to remember the moving reporting threshold before reading any long-run count as a clean safety signal.
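Normalizing by exposure is a single division, but it can reverse the reading of a trend. The sketch below converts yearly event counts into events per million train-miles; all figures are synthetic placeholders, and the train-mile series would in practice come from the FRA's separately published operational data.

```python
def rate_per_million_train_miles(events_by_year, train_miles_by_year):
    """Return {year: events per million train-miles}, skipping years with no exposure figure."""
    rates = {}
    for year, n in sorted(events_by_year.items()):
        miles = train_miles_by_year.get(year)
        if not miles:
            continue                  # missing or zero exposure: skip, do not guess
        rates[year] = n / (miles / 1_000_000)
    return rates

# Synthetic placeholders, not real FRA figures.
derailments = {2019: 1200, 2020: 1000, 2021: 1100}
train_miles = {2019: 600_000_000, 2020: 400_000_000}   # 2021 not yet published

derailment_rates = rate_per_million_train_miles(derailments, train_miles)
for year, rate in derailment_rates.items():
    print(f"{year}: {derailments[year]:,} derailments, {rate:.2f} per million train-miles")
```

In this made-up series the count falls from 1,200 to 1,000 while the rate rises from 2.0 to 2.5, because traffic fell faster than derailments did. That is the pattern a raw count hides.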

Cause-code root-cause analysis exploits the taxonomy directly: aggregating events by cause family and by specific code reveals where the network's failures concentrate (whether the recurring problem on a given railroad is track defects, equipment failures, or operating errors) and how that mix differs by railroad, by region, and by accident type. This is the diagnostic that tells a regulator or a railroad where to direct inspection, maintenance, or training.

PTC and intervention evaluation narrows that to the specific cause codes a technology or rule is meant to address, measuring whether the targeted failures have declined where the intervention applies.

Railroad and corridor benchmarking uses the reporting-railroad field to compare carriers (again, only sensibly when normalized by each railroad's exposure) and to identify the corridors and crossings with disproportionate accident histories.

And the grade-crossing-safety analysis already described joins accidents to the inventory to drive crossing-improvement prioritization. Across all of these, the through-line is that the dataset converts a stream of individual accidents into a measurable, comparable, cause-attributed picture of systemic risk.

Python workflow: accidents by cause and railroad from the FRA download files

The FRA Office of Safety Analysis publishes the accident/incident data through its public site at safetydata.fra.dot.gov as downloadable extract files—no API key required. From the download pages you select the file type, the report year, and the scope (the national file or a single state); the classic download is a self-extracting archive that expands to a dBASE (DBF) file, and the December 2024 full-dataset release also offers flat CSV exports. The script below reads a downloaded rail-equipment (Form 54) accident extract, resolves the (terse and version-dependent) column names defensively, and computes three of the core views: events by reporting railroad, events grouped into cause families by the leading character of the cause code, and a derailment-by-year trend with reported damage. Because the FRA column codes are short and shift between releases, the loader discovers the working column names at runtime rather than hard-coding them, and any production use should be validated against the current FRA file-layout documentation and should loop over report years for a full panel.

import io, zipfile
import requests
import pandas as pd
from collections import Counter

# FRA Office of Safety Analysis -- public accident/incident downloads.
# No API key required. The Office of Safety Analysis publishes the
# Railroad Accident/Incident Reporting System (the Form 6180.54
# rail-equipment accidents) as downloadable extracts, selected by file
# type, year, and scope (national or a single state). The classic
# download is a self-extracting ZIP wrapping a dBASE (.dbf) file; the
# December 2024 "full datasets" release also offers flat CSV exports.
# Pick the appropriate accident extract from the public download pages
# and read it locally, or stream a downloaded extract URL:
#   https://safetydata.fra.dot.gov/officeofsafety/publicsite/downloads/downloads.aspx
#   https://safetydata.fra.dot.gov/officeofsafety/publicsite/on_the_fly_download.aspx
# The terse column codes change between releases, so the loader resolves
# columns defensively rather than hard-coding them.


def read_dbf_bytes(blob):
    # dbfread opens tables by file path and does not accept file-like
    # objects, so spill the bytes to a temporary .dbf file first.
    import os, tempfile
    from dbfread import DBF                        # pip install dbfread
    with tempfile.NamedTemporaryFile(suffix=".dbf", delete=False) as tmp:
        tmp.write(blob)
    try:
        return pd.DataFrame(iter(DBF(tmp.name, char_decode_errors="ignore")))
    finally:
        os.remove(tmp.name)


def load_accidents(source):
    # 'source' is a path or URL to a downloaded accident extract: either a
    # flat CSV, a bare .dbf, or a ZIP wrapping one of those. Sniff the
    # payload and read whichever form arrived.
    if source.startswith(("http://", "https://")):
        r = requests.get(source, timeout=600)
        r.raise_for_status()
        blob = r.content
    else:
        with open(source, "rb") as fh:
            blob = fh.read()

    # is_zipfile also recognizes archives with a prepended executable
    # stub (self-extracting), which do not begin with the "PK" magic.
    if zipfile.is_zipfile(io.BytesIO(blob)):
        zf = zipfile.ZipFile(io.BytesIO(blob))
        names = zf.namelist()
        flat = [n for n in names if n.lower().endswith((".csv", ".txt"))]
        if flat:
            with zf.open(flat[0]) as fh:
                return pd.read_csv(fh, dtype=str, low_memory=False)
        dbf = [n for n in names if n.lower().endswith(".dbf")]
        if not dbf:
            raise RuntimeError("no accident file inside the ZIP archive")
        return read_dbf_bytes(zf.read(dbf[0]))
    if blob[:1] in (b"\x03", b"\x30", b"\x83"):  # dBASE header bytes
        return read_dbf_bytes(blob)
    return pd.read_csv(io.BytesIO(blob), dtype=str, low_memory=False)


def col(frame, *needles):
    # Return the first column whose name contains all of the needles.
    for c in frame.columns:
        u = str(c).upper()
        if all(n.upper() in u for n in needles):
            return c
    return None


def analyze(source):
    df = load_accidents(source)
    df.columns = [str(c) for c in df.columns]
    print(f"Accident/incident records loaded: {len(df):,}")

    c_rr    = col(df, "RAILROAD") or col(df, "RR")
    c_year  = col(df, "YEAR") or col(df, "IYR")
    c_cause = col(df, "CAUSE")
    c_type  = col(df, "ACCIDENT", "TYPE") or col(df, "TYPE")   # specific match first
    c_dmg   = col(df, "ACCDMG") or col(df, "TOTAL", "DAMAGE") or col(df, "DAMAGE")
    c_kld   = col(df, "TOTKLD") or col(df, "KILLED")

    # --- 1. Events by reporting railroad --------------------------------
    if c_rr:
        print("\nTop 12 reporting railroads by event count:")
        for rr, n in df[c_rr].fillna("(blank)").value_counts().head(12).items():
            print(f"  {str(rr)[:38]:<38} {n:>7,}")

    # --- 2. Events by cause code ----------------------------------------
    # Cause codes group into human-factor (H), track (T), equipment (E),
    # signal (S), and miscellaneous (M) families by their leading letter.
    if c_cause:
        families = {"H": "human factors", "T": "track", "E": "equipment",
                    "S": "signal", "M": "miscellaneous"}
        fam = Counter()
        for raw in df[c_cause].fillna(""):
            head = str(raw).strip()[:1].upper()
            fam[families.get(head, "other/blank")] += 1
        print("\nEvents by cause family:")
        for family, n in fam.most_common():
            print(f"  {family:<16} {n:>7,}")

    # --- 3. Derailment trend by year ------------------------------------
    # Accident type "01" is a derailment in the Form 54 type taxonomy.
    if c_year and c_type:
        der = df[df[c_type].fillna("").astype(str).str.zfill(2).str.startswith("01")]
        by_year = der[c_year].dropna().astype(str).str[-4:].value_counts()
        print("\nDerailments by year (last 10 reported years):")
        for yr in sorted(by_year.index)[-10:]:
            n = int(by_year[yr])
            dmg = ""
            if c_dmg:
                tot = pd.to_numeric(der[der[c_year].astype(str).str[-4:] == yr][c_dmg],
                                    errors="coerce").sum()
                dmg = f"  (${tot:,.0f} reported damage)"
            print(f"  {yr}: {n:>5,} derailments{dmg}")
    return df


# Point at an accident extract downloaded from the FRA Office of Safety
# Analysis public download pages (above). Loop the report year for a panel.
if __name__ == "__main__":
    analyze("fra_accidents_2024.dbf")

Two practical notes apply. First, the cause-family grouping in the script is a deliberate simplification: bucketing on the leading character of the cause code is a fast first pass, but a rigorous analysis should map the full code to the FRA's published cause-code dictionary, which distinguishes dozens of specific mechanisms within each family and is what lets a PTC-targeted subset be isolated precisely. Second, raw event counts are almost never a sound basis for comparison on their own. Before ranking railroads or declaring a trend, normalize the counts by exposure (train-miles or car-miles, which the FRA publishes separately) and account for the moving reporting threshold, because an apparent rise or fall in events can reflect a change in the dollar bar rather than a change in safety. The script computes the raw quantities; turning them into defensible safety metrics is the analytical work that follows.

Limitations and analytical caveats

The rail-accidents dataset is the most comprehensive public record of railroad accidents in the United States, but it carries structural limitations that an analyst must internalize before drawing conclusions from it.

The data is self-reported by the railroads. The FRA compiles the record, but it is the regulated railroads themselves that prepare and file the reports, assign the cause codes, and estimate the damage. The system depends on the railroads reporting accurately and completely, and while Part 225 makes under-reporting a serious violation and the FRA audits railroad records, the structural fact remains that the reporter is the party whose safety performance the data measures. Historical concerns about under-reporting—particularly of injuries that railroads might have an incentive not to record—have prompted FRA audits and enforcement, and an analyst should treat the data as authoritative but not assume it is perfectly complete, especially at the margins of what qualifies as reportable.

The reporting threshold moves, which corrupts naive long-run trends. Because a rail-equipment accident is reportable only when its damage exceeds a dollar threshold, and because the FRA raises that threshold over time to keep pace with inflation in equipment and repair costs, the population of events in the dataset is not defined by a constant criterion. A given derailment near the margin might be reportable in one era and not in another purely because of where the bar sits. Any analysis that trends event counts across many years—especially counts of lower-severity events—is therefore partly measuring the history of the threshold, not only the history of railroad safety. Trends in the most severe events (fatalities, major hazmat releases) are far more robust to this problem because they clear any plausible threshold; trends in total reportable accidents are the most exposed to it.
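One common way to blunt the moving-threshold problem is to restate damage in constant dollars and then count only events above a single real cutoff in every year. The sketch below shows the idea on synthetic rows. The deflator values and the cutoff are hypothetical placeholders, not FRA thresholds or a real price index, and the cutoff should be at least as high as the highest real threshold in force during the period, so that every year's filing rule would have captured the events being counted.

```python
def constant_threshold_counts(records, deflator, real_cutoff):
    """Count events per year whose damage in base-year dollars exceeds one fixed cutoff.

    deflator[year] is the price level relative to the base year (base year = 1.0).
    """
    counts = {}
    for r in records:
        year = int(r["incident_year"])
        real_damage = float(r["total_damage"]) / deflator[year]
        if real_damage > real_cutoff:
            counts[year] = counts.get(year, 0) + 1
    return counts

# Hypothetical placeholders: prices in 2000 are half the 2020 level.
deflator = {2000: 0.5, 2020: 1.0}
sample = [
    {"incident_year": 2000, "total_damage": 4000},    # 8,000 in 2020 dollars
    {"incident_year": 2000, "total_damage": 5000},    # 10,000
    {"incident_year": 2000, "total_damage": 7000},    # 14,000
    {"incident_year": 2020, "total_damage": 11500},
    {"incident_year": 2020, "total_damage": 15000},
]
raw_counts = {2000: 3, 2020: 2}                       # what a naive tally shows
harmonized = constant_threshold_counts(sample, deflator, real_cutoff=12000)
print("raw:       ", raw_counts)
print("harmonized:", harmonized)
```

In the made-up data the raw tally falls from three events to two, while the harmonized count is one event in each year. The apparent decline came from where the bar sat, not from the events themselves. The price of this approach is that it discards the lower-severity events entirely.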

Cause codes are judgments, and they are coarse. The cause code is the most analytically valuable field, but it is an attribution—a coded judgment about why the accident happened—made by the reporting railroad, sometimes under conditions of incomplete information and sometimes with an interest in how the cause is characterized. Two railroads, or the same railroad in two eras, may code similar events differently. And the single primary cause code compresses what is often a chain of causation into one label; the contributing-cause field helps but does not fully capture the multi-factor reality of many accidents. Treating the cause code as an objective physical fact, rather than as a structured expert judgment with its own variability, over-reads what the field can bear—cross-railroad and cross-era cause comparisons should be made with this firmly in mind.

Counts without exposure mislead, and incidents are not all accidents. A raw count of events by railroad reflects, more than anything, how much railroading each carrier does: a large Class I railroad will have more accidents than a short line simply because it runs vastly more trains over vastly more track. Meaningful comparison requires normalizing by exposure—train-miles, car-miles, or track-miles. Relatedly, the broader reporting system mixes true accidents (the Form 54 events this dataset centers on) with other reportable incidents and casualties recorded on the companion forms; an analyst must be precise about which population a query is drawing from, because conflating rail-equipment accidents with all reportable incidents inflates and distorts any rate.

Held with these caveats in mind, the fra_rail_accidents table is a uniquely valuable resource: an event-resolved, cause-coded, casualty-counted, damage-quantified record of roughly 224,000 reportable railroad accidents stretching back to 1975—the federal ledger that turns every derailment, collision, and grade-crossing strike into a row that can be counted, compared, and traced to a cause, and the evidentiary foundation on which the country argues about how to make its railroads safer.

Related writing

FRA Highway-Rail Grade Crossing Inventory: The Federal Database Behind 250,000 Railroad Crossings — The inventory is the other half of grade-crossing safety analysis: every grade-crossing collision in the accidents file carries the DOT-AAR crossing identifier that joins it to the inventory's record of warning devices, traffic, and physical layout, turning a casualty count into a question about infrastructure.

NTSB Aviation Accident Database: The Federal Record Behind Every US Aircraft Accident Investigation — The aviation counterpart to the rail record, built around independent NTSB investigation and probable-cause findings, and a useful contrast in how a different mode codes cause and severity for the same root question of why transportation accidents happen.

FMCSA Crash Data: The Federal Database Behind Large Truck and Bus Crashes — The highway-freight analogue: where the FRA records rail accidents by reporting railroad and cause code, FMCSA records large-truck and bus crashes by carrier, and together they map the safety record of the two surface modes that move the nation's freight.