FRA Railroad Accident Data: The Federal Database Behind Every US Rail Incident Since 1975

· AI Analytics

Since 1975, every significant railroad accident in the United States has generated a federal record filed with the Federal Railroad Administration. The FRA Safety Data system now holds more than 224,000 accident and incident reports spanning half a century of American rail operations—from Class I freight giants like BNSF and Union Pacific down to heritage tourist lines and industrial switching operations. Understanding what that database contains, how its reporting forms are structured, and what the data reveals about rail safety trends is the subject of this guide.

The FRA and the federal railroad safety framework

The Federal Railroad Administration is the modal safety regulator for US railroad operations, housed within the Department of Transportation. FRA administers the Federal Railroad Safety Act, which grants it authority over all aspects of railroad safety: track geometry, locomotive equipment, signal systems, operating practices, and hazardous materials transportation. That authority extends to every railroad operating on the US national rail network, regardless of size—from Class I freight carriers moving hundreds of millions of tons of freight annually to the smallest heritage railroad.

FRA's public data interface is the FRA Safety Data website at safetydata.fra.dot.gov. The portal provides web-based query access, bulk data downloads, and a REST API for programmatic access to accident records, grade crossing inventory data, employee injury statistics, and inspection records. All accident reporting obligations flow from 49 CFR Part 225, the federal railroad accident reporting regulations, which require railroads to file structured monthly reports, due within 30 days after the end of the month in which a qualifying incident occurred.

The FRA accident database is functionally separate from the National Transportation Safety Board. FRA collects primary accident data from reporting railroads as part of its regulatory compliance function. NTSB independently investigates significant accidents (those involving multiple deaths, evacuations, or national significance) and issues probable cause findings. For any major rail accident, the FRA Form 54 record is the primary federal data point; the NTSB investigation report is the authoritative causal analysis. The two should be read together.

The three reporting forms: Form 54, Form 57, Form 55

The FRA accident database is organized around three primary reporting forms, each capturing a distinct category of railroad safety event under 49 CFR Part 225.

Form FRA F 6180.54 — Rail Equipment Accident/Incident Report is the foundational train accident report. Form 54 captures every qualifying train accident: derailments, collisions between trains, collisions between trains and highway vehicles at non-crossing locations, fires, explosions, and other rail equipment accidents. The reportability threshold is a property damage floor of $11,200 (adjusted annually for inflation; the 2024 figure), any death or injury meeting the definition of reportable injury, the evacuation of 50 or more people, or the release of hazardous materials in reportable quantities. An accident below the property damage threshold is still reportable if it produces any of those other consequences. Coverage begins in 1975 and the dataset is updated monthly.
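
The reportability test described above reduces to a short predicate. The sketch below is illustrative only: the class and field names are hypothetical, the dollar figure is the one quoted in this section, the treatment of damage exactly at the threshold is an assumption, and the authoritative criteria remain those in 49 CFR Part 225.

```python
from dataclasses import dataclass

# Threshold quoted in this article; FRA adjusts it periodically.
DAMAGE_THRESHOLD_USD = 11_200
EVACUATION_THRESHOLD = 50


@dataclass
class RailEvent:
    """Hypothetical summary of an event; field names are illustrative."""
    property_damage_usd: float
    deaths: int = 0
    reportable_injuries: int = 0
    people_evacuated: int = 0
    reportable_hazmat_release: bool = False


def is_form54_reportable(event: RailEvent) -> bool:
    """Return True if any one of the criteria described in the text is met."""
    return (
        # Assumption: damage equal to the threshold counts as reportable.
        event.property_damage_usd >= DAMAGE_THRESHOLD_USD
        or event.deaths > 0
        or event.reportable_injuries > 0
        or event.people_evacuated >= EVACUATION_THRESHOLD
        or event.reportable_hazmat_release
    )


minor_derailment = RailEvent(property_damage_usd=4_000)
minor_with_injury = RailEvent(property_damage_usd=4_000, reportable_injuries=1)
print(is_form54_reportable(minor_derailment))   # False
print(is_form54_reportable(minor_with_injury))  # True
```

The second call shows the point made in the text: a low-damage event still qualifies once any other consequence is present.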

Form FRA F 6180.57, the Highway-Rail Grade Crossing Accident/Incident Report, covers collisions between trains and highway users at the approximately 128,000 public grade crossings documented in the FRA Grade Crossing Inventory. Form 57 has no minimum property damage threshold: any grade crossing incident resulting in injury, death, or damage is reportable. Each Form 57 record links by the USDOT crossing number to the Grade Crossing Inventory, which documents the physical characteristics, warning device type, and traffic volume of every public crossing. That linkage is the analytical foundation for crossing risk analysis.
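
The crossing-number linkage can be sketched as a keyed lookup. Everything in this example is hypothetical: the rows are hand-written stand-ins for Form 57 and inventory extracts, and the field names (`crossing_id`, `warning_device`, `aadt`) are illustrative rather than the actual column names in FRA's files.

```python
# Hypothetical rows standing in for Form 57 incident records.
incidents = [
    {"crossing_id": "000001A", "year": 2022, "killed": 0},
    {"crossing_id": "000001A", "year": 2023, "killed": 1},
    {"crossing_id": "000002B", "year": 2023, "killed": 0},
    {"crossing_id": "999999Z", "year": 2023, "killed": 0},  # not in inventory
]
# Hypothetical rows standing in for Grade Crossing Inventory records.
inventory = [
    {"crossing_id": "000001A", "warning_device": "crossbucks", "aadt": 1200},
    {"crossing_id": "000002B", "warning_device": "gates", "aadt": 8500},
]

# Index the inventory once so each incident lookup is a dict access.
inventory_by_id = {row["crossing_id"]: row for row in inventory}


def enrich(incident: dict) -> dict:
    """Attach inventory attributes; unmatched incidents get None values."""
    crossing = inventory_by_id.get(incident["crossing_id"], {})
    return {
        **incident,
        "warning_device": crossing.get("warning_device"),
        "aadt": crossing.get("aadt"),
    }


enriched = [enrich(row) for row in incidents]
unmatched = [r["crossing_id"] for r in enriched if r["warning_device"] is None]
print(f"{len(enriched)} incidents, {len(unmatched)} without an inventory match")
```

Keeping unmatched incidents in the output, rather than dropping them, makes gaps in the linkage visible before any risk statistics are computed.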

Form FRA F 6180.55 — Railroad Injury and Illness Summary captures employee injuries and occupational illnesses separately from train accidents. An employee injured during a derailment will appear in both the Form 54 record (as a casualty) and the Form 55 record (as an employee injury). Form 55 also captures injuries and illnesses that occur in railroad operations without any accompanying train accident—a trackworker struck by a maintenance-of-way vehicle, a locomotive engineer injured during coupling operations, an employee developing an occupational illness from chemical exposure. The companion Form 55A provides individual-level employee casualty detail for the most severe incidents.

Train accident classification: cause codes and accident types

Form 54 records carry two primary analytical classification fields: the accident type and the primary cause code. Together they describe what happened and why.

Accident types in the Form 54 database break into four broad categories. Derailments—where one or more cars or locomotives leave the rails—are the most common accident type by count, typically representing 60 to 70 percent of all Form 54 records in any given year. Collisions, which include train-to-train impacts, train-to-vehicle collisions at non-crossing locations, and train-to-object collisions, are the second category. Fires and explosions form a third category, often overlapping with derailments where fuel or hazardous cargo ignites. The fourth category, “other,” covers events that meet reportability thresholds without fitting the prior classifications.

Primary cause codes are four-character alphanumeric codes assigned by the reporting railroad under FRA's structured cause taxonomy. The taxonomy organizes causes into five primary groups. Track causes cover physical defects in track infrastructure: geometry failures (cross level, surface, alignment, joint), broken rail, rail defects, tie deterioration, and roadbed failures. Signal causes cover failures of signal and communication systems. Equipment causes cover mechanical failures in rolling stock: wheel failures, axle failures, bearing failures (the failure mode at East Palestine), coupler failures, and brake system defects. Human factor causes cover operator error: train handling errors, excessive speed, signal violations, improper switch alignment, and crew communication failures. Miscellaneous causes cover everything that does not fit the preceding categories, including weather events (flash floods undermining track subgrade, track buckling from extreme heat) and obstructions on the right-of-way.

Each cause code is assigned by the reporting railroad, not by an independent investigator. This creates a systematic bias worth noting for analytical work: railroads have structural incentives to assign causes to track defects or equipment failures rather than to human factors, since human factor coding creates more direct liability exposure for operating practices. NTSB investigations of major accidents have repeatedly found human factor contributions that were not reflected in the initial FRA Form 54 cause code assignment. For trend analysis across the full 224,000-record dataset, the cause codes are analytically productive; for individual incident investigation, cross-referencing NTSB reports is essential.

East Palestine and the hazmat context

On February 3, 2023, Norfolk Southern train 32N derailed near East Palestine, Ohio, with 38 cars leaving the rails on a mainline stretch of Class 4 track. Eleven of the derailed cars were carrying hazardous materials: vinyl chloride, butyl acrylate, ethylhexyl acrylate, and other chemicals. Three days after the derailment, Norfolk Southern and emergency managers made the decision to conduct a controlled burn of the vinyl chloride to prevent a potential uncontrolled explosion, releasing phosgene and hydrogen chloride into the surrounding area. More than 2,000 residents evacuated. Local waterways were contaminated. The incident drew sustained national attention to hazardous materials transport risk on the freight rail network.

The National Transportation Safety Board's investigation found the probable cause to be a wheel bearing failure on car 23 of the train. Wayside hot box detectors had recorded rising temperatures on that bearing at multiple points along the route, but the readings did not reach the alert thresholds in the applicable Norfolk Southern protocol in time to stop the train before the bearing failed catastrophically. The NTSB investigation generated 37 safety recommendations directed at FRA, Norfolk Southern, and the Association of American Railroads, including specific changes to hot box detector alert thresholds and response protocols. FRA issued a Safety Advisory urging railroads to review and tighten hot bearing detector protocols across the industry.

The East Palestine derailment was exceptional in its public visibility and political consequence, but it was not exceptional in its basic mechanics. FRA Form 54 data covering the decade before East Palestine surfaces several hundred derailments involving hazardous materials releases, caused by equipment failures including bearing failures. The FRA tracks hazardous materials involvement specifically: Form 54 records include fields for the number of cars in the train loaded with hazmat, the number of hazmat cars that released their contents, and the volume released. Annual hazmat release incidents have averaged roughly 180 to 220 per year over the past decade. Despite that frequency, most hazmat release incidents involve small quantities of low-hazard commodities; East Palestine was extreme in the combination of toxic commodities, release volume, and proximity to a residential community.

The longer arc of the safety record provides important context. Class I railroad train accident rates, measured as accidents per million train miles operated, fell from approximately 2.5 in the mid-1970s to approximately 0.9 by 2022—a roughly 65 percent reduction over five decades, even as freight volumes grew substantially. The improvement is real and attributable to identifiable technological changes: wayside detector networks, continuous rail flaw detection, and, most recently, Positive Train Control. East Palestine does not erase that trend. It does illustrate that the remaining accident rate, however low in relative terms, still produces significant incidents when applied to the scale of the US freight network.
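
The percentage quoted above follows directly from the two endpoint rates. The sketch below reproduces that arithmetic and also derives the implied average annual change; the 1975 start year is an assumption standing in for "the mid-1970s", and the function names are illustrative.

```python
def percent_reduction(start_rate: float, end_rate: float) -> float:
    """Total percentage decline from start_rate to end_rate."""
    return (start_rate - end_rate) / start_rate * 100


def annualized_change(start_rate: float, end_rate: float, years: int) -> float:
    """Constant yearly rate of change that would link the two endpoints."""
    return (end_rate / start_rate) ** (1 / years) - 1


START_RATE, END_RATE = 2.5, 0.9  # accidents per million train miles (from the text)
YEARS = 2022 - 1975              # assumption: treat 1975 as the starting year

total = percent_reduction(START_RATE, END_RATE)
yearly = annualized_change(START_RATE, END_RATE, YEARS)
print(f"Total reduction: {total:.0f}%")  # 64%, i.e. "roughly 65 percent"
print(f"Implied average annual change: {yearly:.2%}")
```

The annualized figure is only a summary of the endpoints; the actual year-to-year path need not have been smooth.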

Grade crossing incidents: 2,000 collisions per year

Highway-rail grade crossing incidents represent a category of rail safety risk that is structurally different from train accidents. In a train accident, the initiating failure (a rail defect, a bearing failure, a signal violation) occurs within the railroad system. In a grade crossing incident, the initiating event is typically a highway user entering the crossing when a train is present or approaching. The railroad is often not at fault in any regulatory sense, but the consequences fall on highway users.

The FRA records roughly 2,000 to 2,200 highway-rail grade crossing collisions per year. Approximately 270 to 290 people die annually in those collisions—a figure that, while a fraction of the 40,000-plus annual US traffic fatalities, represents a significant and preventable toll. The Form 57 data makes clear that the risk is heavily concentrated at passive crossings—those with no active warning devices, or with crossbuck signs only—and at crossings where sight distance is obstructed by vegetation, structures, or terrain.

The FRA Grade Crossing Inventory documents public and private grade crossings across the United States, including approximately 128,000 public crossings. Each crossing record includes the warning device type (no device, crossbucks, stop sign, flashing lights, flashing lights with gates, or constant warning), number of tracks, approach grade, sight distance, and annual average daily traffic. The warning device field is the single most powerful predictor of crossing risk: crossings with automatic gates have dramatically lower collision rates per vehicle crossing than passive crossings, and lower fatality rates among those collisions that do occur.
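
A collision rate "per vehicle crossing" needs an exposure denominator, which the inventory's daily traffic field can supply. The sketch below shows the arithmetic only: the rows, the five-year window, and the field names are invented for illustration and are not FRA figures or findings.

```python
from collections import defaultdict

YEARS = 5  # assumed observation window for the collision counts

# Invented rows: one per crossing, with collisions counted over YEARS years.
crossings = [
    {"device": "crossbucks", "aadt": 400, "collisions": 2},
    {"device": "crossbucks", "aadt": 600, "collisions": 1},
    {"device": "gates", "aadt": 9000, "collisions": 3},
    {"device": "gates", "aadt": 6000, "collisions": 0},
]

totals = defaultdict(lambda: {"collisions": 0, "vehicle_crossings": 0})
for row in crossings:
    bucket = totals[row["device"]]
    bucket["collisions"] += row["collisions"]
    # Exposure: average daily traffic x 365 days x years observed.
    bucket["vehicle_crossings"] += row["aadt"] * 365 * YEARS

# Collisions per million vehicle crossings, by warning device type.
rates = {
    device: t["collisions"] / t["vehicle_crossings"] * 1_000_000
    for device, t in totals.items()
}

for device, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{device:<12} {rate:.3f} collisions per million vehicle crossings")
```

Normalizing by exposure matters because gated crossings tend to sit on busier roads; raw counts alone can make them look no safer than passive crossings.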

The top states by grade crossing incident count follow closely from which states have the most at-grade crossings: Texas, Illinois, California, Georgia, and Louisiana have historically led the Form 57 counts. FRA's federal grade crossing improvement programs, including the Consolidated Rail Infrastructure and Safety Improvements (CRISI) grant program funded at over $1 billion under the Infrastructure Investment and Jobs Act of 2021, direct funding toward upgrading passive crossings on high-traffic corridors. Operation Lifesaver, the federally supported public education program, focuses on driver behavior at crossings. Quiet Zones, established under a separate FRA regulatory process, allow communities to apply to designate crossings where trains are not required to sound the locomotive horn, provided the crossing meets specific supplementary safety measure requirements.

Positive Train Control and the regulatory safety record

FRA tracks three primary safety performance metrics: the train accident rate per million train miles, the employee injury rate per 200,000 employee hours, and the grade crossing incident rate per million train miles. These metrics are published annually for Class I railroads and provide the longitudinal benchmark against which safety improvement claims are evaluated.
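
Each of these metrics is a count divided by an exposure measure and scaled to a convenient base. The sketch below shows the two scalings; the input figures are placeholders chosen to make the arithmetic easy to follow, not published FRA values, and the function names are illustrative.

```python
def per_million_train_miles(events: int, train_miles: float) -> float:
    """Events per million train miles (used for accident and crossing rates)."""
    return events / train_miles * 1_000_000


def per_200k_employee_hours(injuries: int, employee_hours: float) -> float:
    """Injuries per 200,000 hours, roughly 100 full-time employees for a year."""
    return injuries / employee_hours * 200_000


# Placeholder inputs for illustration only.
accident_rate = per_million_train_miles(events=450, train_miles=500_000_000)
injury_rate = per_200k_employee_hours(injuries=3_000, employee_hours=300_000_000)

print(f"Train accident rate:  {accident_rate:.2f} per million train miles")
print(f"Employee injury rate: {injury_rate:.2f} per 200,000 employee hours")
```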

The most significant single regulatory safety technology of the past two decades is Positive Train Control. PTC is a system of GPS positioning, digital communications, and onboard computers that automatically enforces speed restrictions, stop signals, and broken-rail protections without depending on crew compliance. If a train approaches a stop signal at excessive speed and the engineer fails to brake, PTC intervenes and applies the brakes automatically. If a train enters a work zone where the track is occupied by maintenance crews, PTC stops the train before it reaches the occupied segment.
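
The intervention logic described above can be illustrated with a deliberately simplified model: compare the distance needed to stop against the distance remaining to the stop target. This is a toy sketch, not how any deployed PTC system computes braking. Real systems account for grade, train weight, brake characteristics, and more. The constant deceleration, safety margin, and function names are all assumptions.

```python
MPH_TO_MPS = 0.44704  # exact conversion factor, miles per hour to metres per second


def stopping_distance_m(speed_mph: float, decel_mps2: float) -> float:
    """Distance to stop under constant deceleration: v^2 / (2a)."""
    v = speed_mph * MPH_TO_MPS
    return v * v / (2 * decel_mps2)


def must_enforce(
    speed_mph: float,
    distance_to_target_m: float,
    decel_mps2: float = 0.3,  # assumed braking rate for a heavy freight train
    margin_m: float = 100.0,  # assumed safety buffer short of the target
) -> bool:
    """True when braking must begin now to stop short of the target."""
    needed = stopping_distance_m(speed_mph, decel_mps2) + margin_m
    return needed >= distance_to_target_m


for distance in (2000, 900):
    action = "apply brakes" if must_enforce(50, distance) else "no intervention"
    print(f"50 mph, {distance} m to stop signal: {action}")
```

The check depends only on speed and position, not on anything the crew does, which is the sense in which enforcement does not depend on crew compliance.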

The mandate for PTC came from the Rail Safety Improvement Act of 2008, passed in the wake of the September 2008 Chatsworth, California collision in which a Metrolink commuter train traveling at 40 mph struck a Union Pacific freight train that had a clear signal. Twenty-five people died. The NTSB investigation found that the Metrolink engineer had been texting on his personal cell phone and missed a stop signal. PTC would have stopped the train. The political pressure from Chatsworth drove the 2008 mandate; the statutory deadline for full implementation was extended multiple times before Class I railroads achieved compliance by 2020.

The FRA PTC implementation database tracks the installation and certification status of PTC systems across the national network. With Class I implementation complete, the Form 54 data from 2020 onward provides the first systematic test of PTC's preventive effect at scale—specifically, whether accident rates in the signal and human factors cause categories declined faster after 2020 than the prior trend would have predicted.

FRA regulatory enforcement: inspections and civil penalties

FRA's enforcement authority operates through safety inspections conducted across five disciplines: track, locomotive and equipment, signal and train control, operating practices, and hazardous materials. Federal inspectors and approved state inspectors conduct approximately 140,000 safety inspections annually, citing roughly 28,000 violations. The violation data is published separately from the accident data and provides a regulatory compliance picture that complements the accident record.

Civil monetary penalties for railroad safety violations can reach $27,904 per violation as of 2024 (indexed annually to inflation), with additional per-day penalties for continuing violations. FRA's Safety Assurance and Compliance Program (SACP) uses risk-based inspection targeting to concentrate inspection resources on railroads and track segments with elevated accident histories or prior violation patterns. The gap between proposed and assessed civil penalties in railroad enforcement is significant, as it is in most federal safety enforcement programs—assessed penalties are typically negotiated downward from proposed amounts.

FRA also administers the CRISI grant program for grade crossing improvements and rail safety technology deployment, distributing capital funding to state and local governments and railroads for infrastructure upgrades that go beyond what the civil penalty framework can compel. The CRISI program has become a primary vehicle for funding PTC deployment on commuter and short-line railroads.

NTSB as the complementary investigative database

The National Transportation Safety Board occupies a distinct role in the federal railroad safety data ecosystem. NTSB does not regulate railroads—it has no enforcement authority. Its mandate is independent accident investigation: determining probable cause, identifying contributing factors, and issuing safety recommendations to the agencies and organizations with the authority to act.

NTSB investigates railroad accidents that meet significance thresholds: multiple deaths, mass evacuations, or accidents of national public interest. For each investigation, NTSB publishes a detailed public report covering the accident sequence, factual findings, probable cause determination, and safety recommendations. These reports are available at ntsb.gov/investigations and represent the most analytically rigorous public source of causal information for major rail accidents. NTSB's railroad and pipeline accident database at ntsb.gov/safety/data-and-statistics provides structured access to investigation records.

The relationship between FRA and NTSB data is complementary in a specific way. FRA Form 54 gives you breadth: 224,000 records with structured fields enabling trend analysis across the full accident population. NTSB gives you depth: detailed causal narratives for the subset of accidents significant enough to warrant full investigation. The East Palestine investigation generated 37 NTSB safety recommendations, more than any single railroad accident in recent memory, which is a measure of the incident's significance and the scope of the systemic issues the investigation surfaced.

FRA Safety Data API: programmatic access

The FRA Safety Data portal at safetydata.fra.dot.gov provides a public REST API for programmatic access to accident, grade crossing, and employee injury data. The base endpoint is https://safetydata.fra.dot.gov/OfficeofSafety/publicsite/api/. The API supports both JSON and XML responses. No API key is required for the public endpoints.

The primary accident query endpoint is GetAccidents, which accepts parameters including fromYear and toYear for date range filtering, reportingRailroad for carrier-level queries using the railroad reporting mark, state for two-letter state filtering, and accidentType for filtering by accident classification. Companion endpoints include GetGradeCrossing for Form 57 grade crossing records and GetEmployeeRailroadInjuries for Form 55 employee injury data.

The API is suitable for targeted queries and monitoring applications. For full historical analysis spanning the complete 1975–present range, bulk CSV downloads from the Safety Data portal downloads page are more efficient. Annual CSV files are published separately for Form 54, Form 57, Form 55, and Form 55A. Column layouts are consistent across modern-era files; pre-1990 files use a legacy schema that requires a separate data dictionary. The bulk files are the recommended format for building a local research database for longitudinal trend analysis.
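
When combining annual files across eras, the legacy columns have to be mapped onto the modern layout before the rows can be stacked. The sketch below shows one way to do that with the standard library. The column names and rename map are hypothetical, since the real mappings come from FRA's data dictionaries, and in-memory strings stand in for the downloaded files.

```python
import csv
import io

# Hypothetical mapping from legacy column names to modern ones.
LEGACY_RENAMES = {"RR": "railroad", "ST": "state", "YR": "year"}


def read_annual_file(handle, legacy: bool = False) -> list[dict]:
    """Read one annual CSV and return rows in the modern column layout."""
    rows = []
    for raw in csv.DictReader(handle):
        if legacy:
            raw = {LEGACY_RENAMES.get(key, key): value for key, value in raw.items()}
        rows.append(
            {
                "railroad": raw["railroad"],
                "state": raw["state"],
                "year": int(raw["year"]),
            }
        )
    return rows


# In-memory stand-ins for a legacy-era file and a modern-era file.
legacy_csv = io.StringIO("RR,ST,YR\nXYZ,OH,1985\n")
modern_csv = io.StringIO("railroad,state,year\nXYZ,TX,2021\n")

combined = read_annual_file(legacy_csv, legacy=True) + read_annual_file(modern_csv)
print(combined)
```

With real downloads, the same function would be called once per annual file, with `legacy=True` for the pre-1990 files.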

Python: querying the FRA API for derailment and hazmat trends

The following Python script demonstrates the core analysis workflow for FRA accident data: querying the Safety Data API for derailment records over the last five years, grouping by state, normalizing against AAR freight route-miles data for rate computation, separately querying for hazmat release incidents, and identifying the most frequently released commodity. The script handles API pagination and missing data fields.

import requests
import pandas as pd
from collections import Counter

# ---------------------------------------------------------------------------
# FRA Railroad Accident Data Analysis
#
# Primary data source:
#   FRA Safety Data API at safetydata.fra.dot.gov
#   Endpoint: https://safetydata.fra.dot.gov/OfficeofSafety/publicsite/api/
#   Supports JSON and XML responses with state, railroad, year, and
#   accident-type filters. No API key required for public endpoints.
#
# Bulk CSV downloads (preferred for historical analysis):
#   https://safetydata.fra.dot.gov/OfficeofSafety/publicsite/Downloads.aspx
#   Annual files: Form 54 (train accidents), Form 57 (grade crossing),
#   Form 55 (employee injuries), Form 55A (employee casualty detail)
# ---------------------------------------------------------------------------

BASE_URL = "https://safetydata.fra.dot.gov/OfficeofSafety/publicsite/api/"

def get_accidents(
    from_year: int,
    to_year: int,
    accident_type: str | None = None,
    state: str | None = None,
    railroad: str | None = None,
    page: int = 1,
    page_size: int = 1000,
) -> list[dict]:
    """
    Query the FRA Safety Data API for train accident records.
    accident_type: "D" for derailment, "C" for collision, etc.
    state: two-letter abbreviation, e.g. "OH", "TX"
    railroad: reporting mark, e.g. "BNSF", "NS", "UP"
    Returns a list of accident record dicts.
    """
    params: dict = {
        "fromYear": from_year,
        "toYear": to_year,
        "format": "json",
        "page": page,
        "pageSize": page_size,
    }
    if accident_type:
        params["accidentType"] = accident_type
    if state:
        params["state"] = state
    if railroad:
        params["reportingRailroad"] = railroad

    url = BASE_URL + "GetAccidents"
    resp = requests.get(url, params=params, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    # API returns {"result": [...], "totalCount": N}
    return data.get("result", [])


def get_all_accidents(
    from_year: int,
    to_year: int,
    accident_type: str | None = None,
    state: str | None = None,
    railroad: str | None = None,
) -> list[dict]:
    """
    Paginate through FRA API results to fetch all matching accident records.
    Stops when a page returns fewer records than page_size.
    """
    page_size = 1000
    all_records: list[dict] = []
    page = 1
    while True:
        batch = get_accidents(
            from_year=from_year,
            to_year=to_year,
            accident_type=accident_type,
            state=state,
            railroad=railroad,
            page=page,
            page_size=page_size,
        )
        if not batch:
            break
        all_records.extend(batch)
        if len(batch) < page_size:
            break
        page += 1
    return all_records


# --- Step 1: Fetch derailment records for the last 5 years ---
# FRA accident type code "D" = derailment
current_year = 2026
derailments = get_all_accidents(
    from_year=current_year - 5,
    to_year=current_year,
    accident_type="D",
)
print(f"Fetched {len(derailments):,} derailment records ({current_year - 5}–{current_year})")

# --- Step 2: Group derailments by state ---
# Each record has a "state" field with the 2-letter abbreviation.
state_counts: Counter = Counter()
for rec in derailments:
    st = (rec.get("state") or "").strip().upper()
    if st:
        state_counts[st] += 1

derail_by_state = pd.DataFrame(
    [(st, cnt) for st, cnt in state_counts.most_common()],
    columns=["state", "derailments"],
)

# --- Step 3: Normalize by AAR route-miles for rate computation ---
# AAR (Association of American Railroads) publishes freight rail miles
# by state in its annual Freight Rail State Fact Sheets.
# Representative freight route-miles by state (miles, approximate):
aar_route_miles = {
    "TX": 10_425, "IL": 7_182, "KS": 6_627, "NE": 6_556, "MT": 5_983,
    "ND": 4_903, "WY": 4_163, "CA": 4_132, "MO": 3_978, "CO": 3_928,
    "OK": 3_743, "MN": 4_581, "IA": 4_267, "OH": 3_367, "PA": 3_145,
    "GA": 2_897, "WA": 3_012, "VA": 2_762, "KY": 2_632, "AL": 2_561,
}

derail_by_state["route_miles"] = derail_by_state["state"].map(aar_route_miles)
# Derailments per 1,000 route-miles. States missing from the table stay NaN
# instead of receiving a misleading rate.
derail_by_state["rate_per_1000_miles"] = (
    derail_by_state["derailments"] / derail_by_state["route_miles"] * 1000
)

top10_count = derail_by_state.nlargest(10, "derailments").reset_index(drop=True)
top10_rate = (
    derail_by_state.dropna(subset=["route_miles"])
    .nlargest(10, "rate_per_1000_miles")
    .reset_index(drop=True)
)

print("\nTop 10 States by Derailment Count")
print(f"{'State':<6}  {'Derailments':>11}")
print("-" * 22)
for _, row in top10_count.iterrows():
    print(f"{row['state']:<6}  {int(row['derailments']):>11,}")

print("\nTop 10 States by Derailment Rate (per 1,000 route-miles)")
print(f"{'State':<6}  {'Derailments':>11}  {'Rate':>8}")
print("-" * 32)
for _, row in top10_rate.iterrows():
    rate = row["rate_per_1000_miles"]
    print(
        f"{row['state']:<6}  {int(row['derailments']):>11,}"
        f"  {rate:>8.3f}"
    )

# --- Step 4: Fetch hazmat release accidents ---
# FRA Form 54 tracks hazmat involvement with specific accident type codes.
# Use type "H" for hazardous materials release events.
hazmat_records = get_all_accidents(
    from_year=current_year - 5,
    to_year=current_year,
    accident_type="H",
)
print(f"\nFetched {len(hazmat_records):,} hazmat-release records ({current_year - 5}–{current_year})")

# --- Step 5: Find most frequently released commodity ---
# Each hazmat record may include a commodity/chemical field.
# Field name varies by API version; check "commodity", "hazmatName", or "chemName".
commodity_counts: Counter = Counter()
for rec in hazmat_records:
    commodity = (
        rec.get("commodity")
        or rec.get("hazmatName")
        or rec.get("chemName")
        or ""
    ).strip().title()
    if commodity and commodity not in {"", "Unknown", "N/A"}:
        commodity_counts[commodity] += 1

if commodity_counts:
    print("\nTop 10 Most Frequently Released Hazmat Commodities")
    print(f"{'Commodity':<35}  {'Incidents':>9}")
    print("-" * 48)
    for commodity, count in commodity_counts.most_common(10):
        print(f"{commodity:<35}  {count:>9,}")
else:
    print("\nNo commodity-level detail available in API response.")
    print("For commodity analysis, use the bulk Form 54 CSV download and")
    print("join to the PHMSA hazmat incident database on incident date/location.")

The derailment-by-state count analysis reveals geographic concentration: states with the most route-miles and the most freight traffic (Texas, Illinois, Nebraska, Kansas) lead in absolute derailment counts. Rate normalization shifts the picture: some smaller states with dense branch-line networks and older infrastructure show elevated rates relative to their network size. The rate normalization using AAR route-miles data is an approximation; a more rigorous analysis would use FRA train-miles-operated statistics by state and railroad, available in FRA's annual safety statistics reports.

The hazmat commodity analysis illustrates a common data quality challenge in federal safety databases: commodity-level detail may be available in bulk CSV files but not fully surfaced through the API. When the API returns incomplete commodity fields, joining Form 54 bulk CSV records to the PHMSA hazardous materials incident database on incident date and location provides the most complete picture of what was released and in what quantities.
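
A date-and-location join of this kind can be sketched as follows. The records and field names are invented for illustration; real Form 54 and PHMSA extracts use their own column names, and location fields usually need more careful normalization than the whitespace and case handling shown here.

```python
from collections import defaultdict


def join_key(record: dict) -> tuple[str, str, str]:
    """Normalize date and location into a comparable tuple."""
    return (
        record["date"],
        record["state"].strip().upper(),
        record["county"].strip().upper(),
    )


# Invented stand-ins for Form 54 bulk rows and PHMSA incident rows.
form54_rows = [
    {"date": "2023-01-15", "state": "OH", "county": "Example", "hazmat_cars_released": 2},
    {"date": "2023-03-02", "state": "TX", "county": "Sample", "hazmat_cars_released": 1},
]
phmsa_rows = [
    {"date": "2023-01-15", "state": "oh", "county": " example ", "commodity": "Commodity A"},
    {"date": "2023-01-15", "state": "OH", "county": "Example", "commodity": "Commodity B"},
]

# One accident can match several PHMSA rows (one per commodity released).
phmsa_index = defaultdict(list)
for row in phmsa_rows:
    phmsa_index[join_key(row)].append(row)

joined = []
for accident in form54_rows:
    matches = phmsa_index.get(join_key(accident), [])
    joined.append({**accident, "commodities": [m["commodity"] for m in matches]})

for row in joined:
    print(row["date"], row["state"], row["commodities"] or "no PHMSA match")
```

Matching on date and place alone can produce false pairings when two incidents share a county and day, so ambiguous matches are worth reviewing by hand.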
