Derailments and grade crossings: using FRA railroad accident data to analyze rail safety trends

13 min read · AI Analytics
Regulatory data · FRA · Railroad safety · Derailments · Grade crossings · Transportation

The Federal Railroad Administration has collected structured accident reports for every significant railroad incident in the United States since 1975. Two linked databases—Form 54, covering all railroad accidents and incidents, and Form 57, covering highway-rail grade crossing collisions—together hold more than 250,000 records documenting derailments, collisions, fires, explosions, and vehicle strikes at road crossings. The data includes train type and speed at the moment of accident, track class, primary cause code, hazardous materials involvement, and casualty counts. What it reveals about derailment failure modes, grade crossing risk factors, and a five-decade arc of safety improvement is the subject of this piece.

What the FRA accident databases cover: Form 54 and Form 57

The Federal Railroad Administration is the federal railroad safety regulator, one of the modal administrations within the Department of Transportation, responsible for safety oversight of approximately 140,000 miles of freight and passenger railroad in the United States. FRA administers safety standards covering track geometry, locomotive equipment, operating practices, and signal systems. Its accident reporting requirements flow from 49 CFR Part 225, which mandates that railroads file structured monthly reports for qualifying accidents within 30 days after the end of the month in which they occur. The two primary report forms serve distinct but related analytical purposes.

Form 54 — Railroad Accident/Incident Report is the primary FRA accident report, capturing all rail accidents that meet either of two thresholds: any accident causing at least one injury or death, or any accident causing property damage of $10,800 or more (the threshold is inflation-adjusted periodically; it was lower in earlier years of the dataset). Form 54 covers the full taxonomy of rail accident types: derailments, collisions between trains, collisions with highway vehicles at non-crossing locations, fires, explosions, and obstructions. Coverage begins in 1975 and the database is updated monthly as railroads file reports.

Form 57 — Highway-Rail Grade Crossing Accident/Incident Report covers collisions between trains and highway users (automobiles, trucks, motorcycles, cyclists, and pedestrians) at the approximately 130,000 public and private highway-rail grade crossings in the United States. Form 57 reporting is required for any grade crossing incident resulting in injury, death, or damage, with no minimum property damage threshold. Coverage also begins in 1975. The grade crossing database links by crossing identifier (the U.S. DOT crossing inventory number) to the National Highway-Rail Crossing Inventory, a companion dataset that documents the physical and warning device characteristics of every public crossing.

The $10,800 damage threshold for Form 54 is the primary reporting gate for property-damage-only accidents. Incidents below this threshold—minor derailments with no injury and limited equipment damage, small fires that self-extinguished, or minor track damage—are not reported to FRA and do not appear in the national database. This creates a systematic selection floor in the Form 54 data, discussed further in the limitations section.
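The reporting gate described above can be sketched as a small predicate. This is a minimal illustration that assumes the two-threshold rule exactly as stated in this article; the function name, parameter names, and default threshold are illustrative, and because the threshold is inflation-adjusted, the value in force for the accident year should be passed in.

```python
def is_form54_reportable(deaths: int, injuries: int, damage_usd: float,
                         damage_threshold_usd: float = 10_800.0) -> bool:
    """Return True if an accident passes the Form 54 gate as described in this article.

    Assumption: any casualty, or damage at or above the threshold, qualifies.
    Pass the threshold that applied in the accident year; the default is only
    the figure quoted in the text.
    """
    has_casualty = (deaths + injuries) > 0
    meets_damage = damage_usd >= damage_threshold_usd
    return has_casualty or meets_damage
```

Applying the same predicate with year-specific thresholds is one way to keep early and recent years comparable in a longitudinal analysis.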

Form 54 data structure: what each railroad accident record contains

Form 54 records are submitted by the reporting railroad and processed into a structured relational table. The key analytical fields:

  • Railroad name and reporting mark — the railroad that owns or operates the track where the accident occurred. The reporting mark (a two-to-four-letter abbreviation such as BNSF, UP, CSX, NS, or AMTK) is more stable than the full name for longitudinal analysis, as railroad corporate names change through mergers and acquisitions while reporting marks often persist. The reporting mark is the most reliable join key across FRA datasets.
  • Accident date, time, and location — the precise date and time of the accident, plus state, county, and milepost location on the owning railroad's timetable. Milepost data enables GIS analysis when joined to railroad network shapefiles published by the Bureau of Transportation Statistics. Location data is not GPS coordinates; it must be joined to the railroad network geometry before it can be mapped.
  • Track class — the FRA track safety standards class (1 through 9) of the track segment where the accident occurred. Track class determines the maximum authorized operating speed: Class 1 (10 mph freight, 15 mph passenger) through Class 9 (the highest class, supporting speeds above 200 mph on approved high-speed corridors). Most mainline freight operates on Class 3, 4, or 5 track; branch lines and industrial spurs typically operate on Class 1 or 2. Track class is one of the most analytically useful stratification variables in the Form 54 database.
  • Train type — a classification of the train's operating purpose: freight, passenger, commuter, work train (maintenance of way), light locomotive (no cars), or other. Freight and passenger trains have systematically different accident profiles; comparing them without controlling for train type produces misleading aggregate statistics.
  • Train speed at accident — the estimated speed of the train at the moment the accident began, reported in miles per hour. Speed is a critical variable for both derailment severity analysis and grade crossing casualty analysis. High-speed impacts at grade crossings are dramatically more lethal than low-speed impacts; the relationship between speed and casualty count is nonlinear.
  • Primary cause code — FRA's structured classification of the proximate cause of the accident. The cause taxonomy is discussed in detail in the next section; it is the central analytical variable for failure pattern research.
  • Hazardous materials involvement — a flag indicating whether any car in the train was carrying hazardous materials (hazmat), and whether any hazmat release occurred as a result of the accident. The hazmat involvement field links Form 54 to PHMSA hazardous materials transport data for incidents involving hazmat releases from rail cars.
  • Casualties — separate counts of railroad employee deaths, railroad employee injuries, passenger deaths, passenger injuries, and other deaths and injuries (including trespassers and bystanders). The disaggregation by person type is important: railroad employee injury rates are covered by FRA worker protection regulations and are independently tracked by the Bureau of Labor Statistics; collapsing them with passenger or public casualties produces analytically incoherent totals.
  • Equipment damage — estimated cost of damage to railroad equipment (locomotives, freight cars, passenger cars) and track infrastructure. As with the property damage field in PHMSA's pipeline incident data, this is an operator self-estimate and does not capture consequential damages such as cargo loss, service interruption costs, or environmental remediation.
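One way to hold the fields listed above in code is a small record type. The sketch below is an assumption-laden illustration: every field name is hypothetical rather than an actual FRA column name, and the field set is trimmed to what the list describes. It keeps the casualty counts separated by person type, as the text recommends.

```python
from dataclasses import dataclass


@dataclass
class Form54Record:
    """One accident record. Field names are hypothetical, not FRA column names."""
    reporting_mark: str          # e.g. "BNSF", "AMTK"
    accident_date: str           # ISO date string such as "2019-06-30"
    state: str
    milepost: float
    track_class: str             # kept as text in case the source uses non-numeric codes
    train_type: str              # "freight", "passenger", "commuter", ...
    speed_mph: int
    cause_code: str
    hazmat_released: bool
    employee_deaths: int = 0
    employee_injuries: int = 0
    passenger_deaths: int = 0
    passenger_injuries: int = 0
    other_deaths: int = 0
    other_injuries: int = 0
    damage_usd: float = 0.0      # operator self-estimate

    def casualties_by_person_type(self) -> dict:
        """Return (deaths, injuries) per person type without collapsing them."""
        return {
            "employee": (self.employee_deaths, self.employee_injuries),
            "passenger": (self.passenger_deaths, self.passenger_injuries),
            "other": (self.other_deaths, self.other_injuries),
        }
```

Keeping the three person types as separate tuples makes it harder to sum them into a single total by accident.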

The primary cause taxonomy: six groups and hundreds of subcodes

FRA's cause code taxonomy organizes accident causes into six primary groups, each subdivided into specific cause codes. The taxonomy uses alphanumeric codes in the format of a letter prefix (identifying the primary group) followed by a three-digit subcode. The six primary cause groups:

  • Human factors (H) — codes H101 through H399 covering the full range of operator error and human performance failures: failure to comply with signal indication, impairment (alcohol or drugs), distraction, excessive speed, failure to secure equipment, improper switching, and crew communication errors. Human factors is historically the largest cause group by incident count across the Form 54 database, accounting for approximately 35 to 38 percent of all reported accidents in recent decades. Employee fatigue, addressed by FRA Hours of Service regulations under 49 CFR Part 228, is a recurring subcode theme in passenger and freight train incidents.
  • Track (T) — codes covering physical defects in the track structure: broken or defective rail, rail joint failure, wide gauge, track geometry anomalies (surface, alignment, crosslevel), tie defects, and roadbed failures. Track causes account for roughly 35 percent of derailments in the Form 54 database—slightly more than human factors as a share of the derailment subtype. Rail defects, particularly transverse detail fractures and bolt hole cracks, are the most common track cause subcode. FRA track inspectors (federal and railroad-employed) are required to identify and remediate track defects before they cause accidents; the track inspection database, published separately from the accident data, provides context for whether inspections preceded an accident.
  • Equipment (E) — codes covering mechanical failures in the rolling stock: wheel failures (broken flanges, flat spots), axle failures, bearing failures (particularly hot box failures that preceded the introduction of wayside detector networks), coupler failures, and brake system failures. Equipment causes were more prominent in older dataset years; mechanical improvements in freight car design and the widespread deployment of wayside detector systems (hot box detectors, wheel impact load detectors, dragging equipment detectors) have shifted the equipment failure share downward over the 50-year dataset.
  • Signal and communication (S) — codes covering signal system failures, improper signal use, train control system failures, and communication failures between crews and dispatchers. This cause group has taken on increased analytical significance with the mandated implementation of Positive Train Control (PTC) systems under the Rail Safety Improvement Act of 2008. Accidents coded to signal and communication causes that occurred before PTC implementation can be assessed against whether PTC—which automatically enforces speed restrictions and stop signals—would have prevented the accident.
  • Miscellaneous and other (M) — a residual category for accidents where the cause is established but does not fit the four preceding primary groups. Grade crossing collisions where the railroad was not at fault are sometimes coded here; the Form 57 database is a more appropriate source for those events.
  • Weather and environment (W) — codes covering weather-related causes: flash floods undermining track subgrade, ice and snow accumulation on switches, high winds affecting high-profile loads, fog reducing sight distance, and extreme heat causing track buckling (sun kink). Weather causes are relatively low in overall incident share but are overrepresented in high-severity events, particularly flash flood derailments where entire track sections wash out under moving trains.
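Because the letter prefix identifies the primary group, grouping cause codes is a one-character lookup. The sketch below assumes the six prefixes exactly as listed in this article; the function names are illustrative, and the codes used in any test data are placeholders rather than entries verified against FRA's cause-code list.

```python
from collections import Counter

# Prefix-to-group mapping as described in this article (an assumption to verify
# against FRA's published cause-code list before relying on it).
CAUSE_GROUPS = {
    "H": "Human factors",
    "T": "Track",
    "E": "Equipment",
    "S": "Signal and communication",
    "M": "Miscellaneous and other",
    "W": "Weather and environment",
}


def cause_group(code: str) -> str:
    """Map a cause code to its primary group using the first character."""
    cleaned = (code or "").strip().upper()
    return CAUSE_GROUPS.get(cleaned[:1], "Unknown")


def group_shares(codes) -> dict:
    """Return each group's share of the supplied cause codes (fractions summing to 1)."""
    counts = Counter(cause_group(c) for c in codes)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()} if total else {}
```

Blank or unrecognized codes fall into "Unknown" rather than being dropped, so the shares always account for every record.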

Form 57 data structure: grade crossing accident records

Form 57 records document every collision between a train and a highway user at a grade crossing. The data structure is distinct from Form 54, reflecting the different analytical questions that grade crossing incidents raise:

  • Crossing ID (U.S. DOT crossing inventory number) — the nationally unique identifier assigned to each highway-rail grade crossing in the U.S. DOT National Highway-Rail Crossing Inventory, which FRA maintains. This identifier is the join key to the inventory, which documents the physical characteristics of every public crossing: number of tracks, crossing surface type, approach grade, sight distance, traffic volume (annual average daily traffic), and warning device type. The crossing ID is the most valuable field in Form 57 for analytical purposes, because it links the incident record to the engineering and device characteristics that determine crossing risk.
  • Highway user type — classification of the vehicle or person struck: automobile, truck (with subcategory for truck with trailer), bus, motorcycle, bicycle, pedestrian, or all-terrain vehicle. Highway user type is a primary stratification variable for casualty analysis: pedestrian and bicycle incidents at grade crossings have very different characteristic profiles than automobile incidents. Pedestrian incidents are often at non-gated locations and involve trespassers on the right-of-way rather than highway users at designated crossings.
  • Sight obstruction — a flag and descriptive field indicating whether sight distance at the crossing was obstructed at the time of the accident. Sight obstruction types include vegetation, structures, parked vehicles, and terrain. Obstructed sight distance is one of the crossing inventory fields that correlates most strongly with accident history; crossings with documented sight obstructions that subsequently experience accidents raise regulatory and liability questions about the adequacy of prior inspection and remediation.
  • Warning device type — the active or passive warning system present at the crossing at the time of the accident: no device (passive), crossbucks only (passive), stop sign (passive), flashing lights (active), flashing lights with gates (active), or constant warning (active). Warning device type is the single most powerful predictor of crossing casualty severity in the Form 57 data. Crossings with gates have dramatically lower casualty rates than crossings with lights only, which in turn are substantially safer than passive-only crossings. FRA's grade crossing safety improvement program has directed federal funding toward upgrading passive crossings on high-traffic corridors for more than 30 years, and the trend in device upgrades is visible in the crossing inventory data.
  • Train speed at accident — the same field as Form 54, but particularly important in the grade crossing context. FRA analyses have documented that a highway vehicle struck by a train traveling above 50 mph has a dramatically higher probability of fatality than one struck at slower speeds. Speed interacts with warning device type: at gated crossings, the gate activation sequence is calibrated to the train's approach speed, and trains operating above track speed limits may arrive before the gate sequence completes.
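The active/passive split in the warning device field can be sketched as a simple classifier. This assumes the six device labels exactly as written in the list above; real Form 57 files are likely to use coded values instead, so the label sets here are placeholders to be replaced with the file's own codes.

```python
# Device labels taken from the list above; treat them as placeholders for
# whatever coded values the downloaded file actually uses.
ACTIVE_DEVICES = {"flashing lights", "flashing lights with gates", "constant warning"}
PASSIVE_DEVICES = {"no device", "crossbucks only", "stop sign"}


def device_class(device: str) -> str:
    """Classify a warning device label as 'active', 'passive', or 'unknown'."""
    label = device.strip().lower()
    if label in ACTIVE_DEVICES:
        return "active"
    if label in PASSIVE_DEVICES:
        return "passive"
    return "unknown"


def is_gated(device: str) -> bool:
    """True when the device label includes gates."""
    return "gates" in device.strip().lower()
```

Separating gated from lights-only crossings matters because the text treats them as distinct risk tiers within the active category.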

The long-term safety improvement: what five decades of data show

The most striking feature of the FRA accident databases, viewed over the full 1975–2025 period, is the magnitude of the safety improvement. Train accident rates—measured as accidents per million train miles operated—have fallen by approximately 80 percent since the early 1980s peak. Grade crossing fatalities have fallen from more than 900 per year in the mid-1970s to below 250 in recent years, even as train traffic volumes have grown substantially over the same period.
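The rate normalization mentioned above is a short calculation. The sketch below shows accidents per million train miles and the percent change between two rates; the function names are illustrative, and any inputs used to try it out should be treated as placeholders rather than FRA figures.

```python
def accidents_per_million_train_miles(accidents: int, train_miles: float) -> float:
    """Normalize an accident count by exposure, expressed per million train miles."""
    if train_miles <= 0:
        raise ValueError("train_miles must be positive")
    return accidents * 1_000_000 / train_miles


def percent_change(old_rate: float, new_rate: float) -> float:
    """Percent change from old_rate to new_rate (negative means a decline)."""
    if old_rate == 0:
        raise ValueError("old_rate must be non-zero")
    return (new_rate - old_rate) * 100 / old_rate
```

With placeholder inputs, a rate that falls from 10.0 to 2.0 gives `percent_change(10.0, 2.0) == -80.0`, which is the shape of the decline the paragraph describes.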

Three structural changes drive most of the improvement visible in the data. First, the nationwide deployment of wayside detector systems transformed equipment failure rates from the 1980s onward: hot box detectors identify overheating roller bearings before they fail, wheel impact load detectors identify damaged wheels producing dangerous dynamic loads on the track, and dragging equipment detectors flag equipment hanging below the train. Equipment-cause accidents, a leading derailment category through the 1970s, declined sharply as detector networks matured. The Class I railroads (the largest freight carriers, defined by an annual operating revenue threshold) have detector spacing on mainlines measured in tens of miles, not hundreds.

Second, the progressive implementation of Positive Train Control under the Rail Safety Improvement Act of 2008, completed across the Class I railroad mainline network by the 2020 statutory deadline, addresses the largest remaining category of signal-and-human-factor accidents. PTC systems automatically enforce speed restrictions, stop signals, and broken-rail protections without depending on crew compliance. NTSB has documented dozens of pre-PTC accidents that the system would have prevented; the accident database from 2020 onward provides the first systematic test of PTC's preventive effect at scale.

Third, track geometry inspection has been transformed by continuous measurement technology. The FRA's track inspection standards under 49 CFR Part 213 require periodic inspection of all tracks; the standards are class-dependent, with higher-speed track subject to more frequent inspection. Track geometry cars—specialized vehicles equipped with lasers and accelerometers that measure gauge, surface, alignment, and crosslevel at track speed—have replaced much of the manual walking inspection on main lines, enabling defect detection at a frequency and precision that manual methods cannot match.

How to access the FRA safety data

FRA makes its accident data publicly available through several access channels with different trade-offs between ease of use and analytical completeness:

  • FRA Safety Data Portal (safetydata.fra.dot.gov) is FRA's primary public access point. The portal provides web-based query interfaces for both Form 54 and Form 57 data, with filtering by date range, railroad, state, accident type, and cause code. Results can be exported as CSV files. The portal also provides access to the Highway-Rail Crossing Inventory, the FRA employee casualty data, and the rail equipment accident/incident data. It is the best starting point for exploratory analysis before moving to bulk downloads.
  • Bulk CSV downloads by year and form type — FRA publishes annual bulk files for Form 54 and Form 57 separately, available through the Safety Data Portal download section. Files are organized by report year, with each year's file containing all accidents reported in that calendar year. Column layouts are consistent across years for the modern database format, though pre-1990 data uses a legacy format that requires a separate schema reference. The bulk files are the right format for longitudinal analysis and for building a local research database spanning the full 1975–present range.
  • FRA API — FRA provides a REST API for programmatic access to safety data at https://railroads.dot.gov/rail-network-development/planning-and-data. The API supports JSON responses with filtering parameters for date, railroad, state, and accident type. It is suitable for targeted queries and real-time monitoring applications but less efficient than bulk CSV downloads for full historical analysis.
  • National Highway-Rail Crossing Inventory — the companion dataset for Form 57 analysis, available as a bulk download from the Safety Data Portal. The inventory contains a record for every public and many private grade crossings, with the U.S. DOT crossing inventory number as the join key. Any grade crossing risk analysis that uses Form 57 data should be joined to the inventory to obtain crossing physical characteristics.
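For the bulk-download route, the annual files can be stacked into one stream of records. The sketch below uses only the standard library and assumes each yearly file is a CSV with a header row; the function name and the added `report_year` key are illustrative, and file names and column names would come from whatever the portal actually provides.

```python
import csv


def read_yearly_files(files):
    """Yield rows from several yearly CSV files as dicts, tagged with the report year.

    `files` is an iterable of (report_year, open_text_handle) pairs. Header names
    and values are stripped of surrounding whitespace so minor formatting
    differences between years do not break later lookups.
    """
    for report_year, handle in files:
        for row in csv.DictReader(handle):
            cleaned = {
                key.strip(): (value or "").strip()
                for key, value in row.items()
                if key is not None          # ignore overflow cells on malformed rows
            }
            cleaned["report_year"] = report_year
            yield cleaned
```

Because the function accepts open handles rather than paths, it can be pointed at files on disk or at in-memory text for quick checks. Pre-1990 files in the legacy layout would need their own column mapping before being combined with modern years.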

Three research use cases

The FRA databases support a range of analytical questions that are particularly tractable with the public data. Three illustrate the range of research directions the data enables.

Derailment cause by track class. Form 54's combination of track class and cause code fields enables a direct comparison of failure modes between mainline and branch line operations. On Class 4 and 5 mainline track, where trains operate at 60 to 90 mph and the infrastructure investment per mile is high, track geometry defects are proportionally less common as a derailment cause than on Class 1 and 2 branch line track, where maintenance investment is lower and geometry tolerances are more lenient. Equipment failures, particularly wheel and bearing failures, are proportionally higher on mainlines because the greater mileage and speed put more stress on rolling stock components. Human factor causes—particularly excessive speed—are more common on mainlines than on low-speed branch operations. Stratifying Form 54 derailment records by track class and computing cause-group shares within each class quantifies these structural differences and identifies which classes have seen the most improvement over the 50-year dataset.
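The stratification described above can be sketched as a nested count followed by a within-class normalization. The record keys (`accident_type`, `track_class`, `cause_code`) and the `"derailment"` label are hypothetical names chosen for illustration, not FRA column names or codes.

```python
from collections import Counter, defaultdict


def cause_shares_by_track_class(records) -> dict:
    """Return {track_class: {cause_prefix: share}} for derailment records only.

    Shares are computed within each track class, so each inner dict sums to 1.
    """
    counts = defaultdict(Counter)
    for rec in records:
        if rec["accident_type"] != "derailment":
            continue
        prefix = rec["cause_code"].strip().upper()[:1]
        counts[rec["track_class"]][prefix] += 1

    shares = {}
    for track_class, counter in counts.items():
        total = sum(counter.values())
        shares[track_class] = {prefix: n / total for prefix, n in counter.items()}
    return shares
```

Running the same function on separate decade slices of the data is one way to see which track classes changed most over time.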

Grade crossing risk: vehicle type, warning device, and casualty probability. Form 57 joined to the National Crossing Inventory enables a multivariate casualty probability analysis. The key covariates are highway user type (automobile vs. truck vs. pedestrian), warning device type (none vs. passive vs. active lights vs. gated), train speed at accident, and sight distance (obstructed vs. clear). A logistic regression predicting fatality given an accident has occurred consistently finds that gated crossings reduce fatality probability by more than 50 percent relative to passive crossings, controlling for vehicle type and speed. Trucks struck by trains have higher fatality rates than automobiles, controlling for crossing and train characteristics, because the collision geometry is different: a train striking the cab of a tractor-trailer is more likely to result in driver death than a train striking the passenger compartment of an automobile. This analysis requires linking Form 57 accident records to crossing inventory records on the U.S. DOT crossing inventory number, computing crossing-level accident rates normalized by annual average daily traffic, and then modeling casualty probability within the accident subset.
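The first two steps of that workflow, the inventory join and a stratified fatality share, can be sketched with the standard library. This is a simple tabulation, not the logistic regression the paragraph describes, and it does not control for vehicle type or speed. The keys `crossing_id`, `deaths`, and `warning_device` are hypothetical names.

```python
from collections import defaultdict


def join_to_inventory(accidents, inventory):
    """Inner-join accident rows to inventory rows on the crossing ID."""
    by_id = {row["crossing_id"]: row for row in inventory}
    joined = []
    for acc in accidents:
        inv = by_id.get(acc["crossing_id"])
        if inv is not None:
            # Accident fields win if both records carry the same key.
            joined.append({**inv, **acc})
    return joined


def fatality_share_by(joined, key) -> dict:
    """Share of accidents with at least one death, grouped by `key`."""
    totals = defaultdict(int)
    fatal = defaultdict(int)
    for row in joined:
        totals[row[key]] += 1
        if int(row["deaths"]) > 0:
            fatal[row[key]] += 1
    return {group: fatal[group] / totals[group] for group in totals}
```

Accidents without a matching inventory record are dropped by the inner join, so it is worth counting them separately before interpreting the shares.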

East Palestine context: hazmat derailments in the preceding decade. The February 3, 2023 Norfolk Southern derailment near East Palestine, Ohio—in which 38 cars derailed, 11 carrying hazardous materials, resulting in a controlled release and burn of vinyl chloride—drew national attention to hazmat derailment risk on mainline freight railroads. Form 54 provides the context for understanding how unusual or typical the East Palestine derailment was relative to the preceding decade of hazmat derailment events. Filtering Form 54 for records where accident type is derailment and hazmat involvement flag is set, covering 2013–2023, surfaces several hundred similar events with varying release quantities, cause codes, and track classes. The East Palestine derailment was caused, per the NTSB preliminary finding, by a wheel bearing failure that was not detected by the wayside hot box detector network in time to stop the train before the bearing failed catastrophically. Form 54 equipment cause subcodes for bearing failures in the decade before East Palestine show whether the derailment was an outlier event or part of a trend that the data could have identified prospectively.
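The filter described above, followed by a per-year count for a chosen set of cause codes, might look like the sketch below. The record keys are hypothetical, and the set of bearing-failure subcodes is deliberately left as a caller-supplied parameter: the specific codes should be taken from FRA's cause-code list rather than from this example.

```python
from collections import Counter


def hazmat_derailments(records, start_year=2013, end_year=2023):
    """Keep derailments with the hazmat flag set, within the inclusive year window."""
    selected = []
    for rec in records:
        year = int(rec["accident_date"][:4])   # expects ISO dates, e.g. "2016-04-09"
        if (rec["accident_type"] == "derailment"
                and rec["hazmat_involved"]
                and start_year <= year <= end_year):
            selected.append(rec)
    return selected


def yearly_counts(records, cause_codes) -> Counter:
    """Count records per year whose cause code is in the supplied set."""
    return Counter(
        int(rec["accident_date"][:4])
        for rec in records
        if rec["cause_code"] in cause_codes
    )
```

Raw yearly counts are only a first look; dividing by train miles for each year would be needed before calling any pattern a trend.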

Cross-references for complete rail accident analysis

The FRA accident databases are most analytically productive when combined with companion data sources that provide the post-incident investigation findings and regulatory context that the accident records alone cannot supply.

NTSB railroad accident investigation reports. The National Transportation Safety Board investigates major railroad accidents independently of FRA, producing detailed public investigation reports with root cause findings, contributing factor analysis, and safety recommendations. NTSB railroad accident reports go substantially deeper than FRA Form 54 records: they include post-accident track inspection findings, event recorder data (the rail equivalent of a flight data recorder), interview findings, and laboratory analysis of failed components. For any significant accident that generated an NTSB investigation, the NTSB report is the authoritative source for understanding the actual cause—and for assessing whether the FRA cause code in the Form 54 record accurately reflects the investigation finding.

PHMSA hazardous materials transport data. For derailments involving hazardous materials release, PHMSA is the co-regulator for the hazmat aspect of the incident. PHMSA's hazmat incident database (separate from its pipeline databases) covers all hazmat transportation incidents including rail, and provides fields for commodity released, quantity released, and emergency response actions taken. Joining FRA Form 54 hazmat derailment records to PHMSA hazmat incident records provides a more complete picture of the environmental and safety consequences of hazmat derailments than either database provides alone.

FRA track inspection data. FRA publishes a separate track inspection database documenting both federal inspector and railroad self-inspection findings, covering identified defects by track class, railroad, and location. For track-caused derailments, the inspection data provides the critical pre-accident context: was the defect that caused the derailment previously identified and not remediated, or was it an undetected failure? Linking Form 54 track-cause derailment records to prior inspection findings on the same railroad segment (using railroad reporting mark and approximate milepost range) surfaces cases where known defects preceded accidents—a direct test of whether inspection requirements are functioning as the regulatory model intends.
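The linkage described above, matching on reporting mark and an approximate milepost range, can be sketched as a windowed lookup. The record keys, the one-mile window, and the 365-day lookback are all illustrative assumptions. Mileposts may repeat across different lines of the same railroad, so a real implementation would likely also need a subdivision or line identifier.

```python
from datetime import timedelta


def prior_defects(derailment, inspections, milepost_window=1.0, lookback_days=365):
    """Return inspection findings that preceded a derailment on the same segment.

    A finding matches when it has the same reporting mark, lies within
    `milepost_window` miles of the derailment, and is dated within
    `lookback_days` before (but not on or after) the derailment date.
    Dates are expected to be datetime.date objects.
    """
    earliest = derailment["date"] - timedelta(days=lookback_days)
    matches = []
    for finding in inspections:
        same_railroad = finding["reporting_mark"] == derailment["reporting_mark"]
        nearby = abs(finding["milepost"] - derailment["milepost"]) <= milepost_window
        before = earliest <= finding["date"] < derailment["date"]
        if same_railroad and nearby and before:
            matches.append(finding)
    return sorted(matches, key=lambda f: f["date"])
```

An empty result does not show that the defect was undetected; it may only mean the window or lookback was too narrow, or that milepost references differ between the two datasets.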

DOT Pipeline and Hazardous Materials Safety Administration. Beyond the hazmat incident database noted above, PHMSA publishes regulatory enforcement data covering civil penalties against carriers for hazmat transportation violations. For railroads that experience repeated hazmat derailments, the PHMSA enforcement record documents whether those incidents generated regulatory responses proportional to the risk level. The gap between the statutory maximum penalty and the penalty actually assessed appears in railroad hazmat enforcement just as it does in pipeline safety enforcement.

Limitations of the FRA datasets

The FRA accident databases are among the most granular federal transportation safety datasets available, but several systematic limitations shape what the data can and cannot support analytically.

Reporting threshold excludes minor accidents. The $10,800 property damage threshold for Form 54 means that minor derailments, small fires, and low-speed collisions that cause less damage than the threshold amount are not reported nationally. In low-traffic environments (short-line railroads, industrial switching operations, port facilities), sub-threshold accidents may substantially outnumber reportable ones. The national database documents the upper tail of the accident distribution on short lines, not the full distribution. Class I mainline accidents are more completely captured because the property damage threshold is easily exceeded in any mainline derailment.

Railroad self-reporting bias in cause code assignment. Cause codes in Form 54 are assigned by the reporting railroad, not by an independent investigator. Railroads have structural incentives to assign causes to track defects or equipment failures (which can be attributed to maintenance cycles or manufacturer defects) rather than to human factors (which create liability exposure for crew behavior and operating practices). NTSB investigations of major accidents frequently find human factor contributions that the initial FRA Form 54 record attributed to track or equipment. The cause code field is analytically useful for trend analysis but should not be treated as a definitive finding for individual incidents without cross-referencing investigation records.

Grade crossing data requires inventory join for crossing characteristics. Form 57 alone provides the accident record; it does not embed the crossing physical characteristics (sight distance, traffic volume, warning device at time of accident) needed for risk factor analysis. Those characteristics come from the National Highway-Rail Crossing Inventory, and the join is on the U.S. DOT crossing inventory number. Two complications arise. First, the inventory is a current snapshot updated by railroads and state agencies, not a historical time series, so crossing characteristics recorded today may not accurately reflect conditions at the time of a historical accident. Second, private crossings, which account for a substantial share of all crossings, are included in the inventory only partially, and Form 57 records for private crossing accidents may not have a matching inventory record. Restricting analysis to public crossings with complete inventory records limits the dataset but improves data integrity.
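Before restricting the analysis, it helps to measure how many Form 57 records actually find an inventory match. The sketch below tallies matched and unmatched records by crossing type; the keys `crossing_id` and `crossing_type` are hypothetical, and the public/private label is assumed to be available on the accident record.

```python
def audit_inventory_match(accidents, inventory_ids) -> dict:
    """Count matched and unmatched accident records, grouped by crossing type.

    `inventory_ids` is a set of crossing IDs, already stripped and upper-cased.
    Records with no crossing type are grouped under "unknown".
    """
    summary = {}
    for acc in accidents:
        kind = acc.get("crossing_type", "unknown")      # e.g. "public" or "private"
        bucket = summary.setdefault(kind, {"matched": 0, "unmatched": 0})
        key = acc["crossing_id"].strip().upper()
        bucket["matched" if key in inventory_ids else "unmatched"] += 1
    return summary
```

A high unmatched count among private crossings would be consistent with the partial coverage described above, and it quantifies how much data a public-only restriction gives up.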

Related writing

For the PHMSA pipeline incident databases—how the four pipeline safety datasets (gas distribution, gas transmission, hazardous liquids, LNG) are structured, what the cause taxonomy reveals about corrosion and excavation damage failure patterns, and the gap between statutory penalties and assessed penalties in major pipeline incidents: Pipeline spills and explosions: using PHMSA incident data to map 50 years of pipeline failures →

For the NHTSA FARS traffic fatality database—how the Fatality Analysis Reporting System structures 1.1 million crash deaths since 1975, what the 30+ per-year files contain, and how to cross-reference crash vehicles against open NHTSA recall campaigns: Every US traffic death since 1975: using NHTSA FARS to analyze road safety, vehicle defects, and enforcement gaps →

For the NTSB aviation accident database—how the National Transportation Safety Board structures 90,000+ accident reports since 1962, what the probable cause taxonomy reveals about general aviation risk versus commercial aviation, and how investigation findings translate into FAA rulemaking: Every US plane crash since 1962: using the NTSB aviation accident database →