BTS Airline On-Time Performance: The Federal Data Behind Every Flight Delay
Every time a commercial flight lands more than fifteen minutes late, that delay enters a federal database. The Bureau of Transportation Statistics collects roughly six million domestic flight records per year—every departure, every arrival, every delay coded by cause—and publishes the full dataset through its Transtats portal. Alongside on-time performance data, BTS maintains T-100 capacity and traffic statistics for every domestic and international route and Form 41 quarterly financials for the major carriers. Together, these three datasets constitute the most comprehensive public record of airline operations and economics anywhere in the world.
The Bureau of Transportation Statistics
The Bureau of Transportation Statistics is the federal statistical agency for transportation, established by Congress in 1992 and housed within the Department of Transportation. BTS is the transportation equivalent of the Bureau of Labor Statistics or the Census Bureau: its mandate is to produce and publish transportation data that meets federal statistical standards for reliability, objectivity, and independence. Unlike the FAA, which both regulates aviation and collects safety data, BTS is a pure statistical agency with no regulatory enforcement role.
BTS's primary public data access point is the Transtats portal at transtats.bts.gov. Transtats provides interactive query tools, bulk download interfaces, and API access for the major aviation datasets. The portal covers commercial airline on-time performance, T-100 domestic and international traffic statistics, Form 41 carrier financial data, commodity and freight statistics, border crossing data, and several other transportation datasets. BTS also produces the monthly Air Travel Consumer Report in coordination with DOT's Office of Aviation Consumer Protection and publishes the National Transportation Statistics annual compendium.
BTS collects airline data under the authority of 49 U.S.C. § 41708, which gives the Secretary of Transportation the power to require air carriers to file reports. The data collection requirements are implemented through DOT rules that specify which carriers must report, what fields must be reported, and the filing deadlines. BTS data collection is distinct from FAA safety reporting: a carrier can be current on all BTS statistical filings while having open FAA enforcement actions, and vice versa.
ATOP: Airline On-Time Performance Data
The Airline On-Time Performance database (known internally as ATOP and previously marketed as the Air Carrier Statistics / ASQP, or Airline Service Quality Performance, program) is the primary federal source for flight-level on-time data. Reporting has historically been required of any air carrier that accounts for at least one percent of total domestic scheduled-service passenger revenue; DOT lowered the threshold to 0.5 percent beginning with 2018 data, which brought several smaller carriers into the dataset. In practice this means the largest domestic carriers report: American, Delta, United, Southwest, Alaska, Spirit, Frontier, JetBlue, Hawaiian, Allegiant, and a few others depending on the year. Regional carriers operating under mainline branding (Mesa operating as United Express, SkyWest operating as Delta Connection) report separately under their own IATA codes.
Reporting carriers must file records for every domestic flight they operate within 45 days of the end of each month. The resulting dataset contains approximately six million individual flight records per year, covering every domestic scheduled service departure by covered carriers. International flights are not covered by ATOP; BTS collects separate international traffic statistics through the T-100I program but does not collect international on-time data at the flight level.
Each ATOP record contains a standardized set of fields documenting the planned and actual performance of a single flight:
| Field | Description |
|---|---|
| UNIQUE_CARRIER | IATA carrier code (e.g. WN, AA, DL) |
| FL_NUM | Flight number as published in the schedule |
| ORIGIN / DEST | Three-letter IATA airport codes for origin and destination |
| CRS_DEP_TIME | Computerized Reservation System (scheduled) departure time, local hhmm |
| DEP_TIME / ARR_TIME | Actual gate departure and gate arrival times, local hhmm |
| ARR_DELAY | Arrival delay in minutes; positive = late, negative = early |
| CARRIER_DELAY | Minutes of delay attributable to carrier causes |
| WEATHER_DELAY | Minutes attributable to extreme weather |
| NAS_DELAY | Minutes attributable to the National Aviation System |
| SECURITY_DELAY | Minutes attributable to security causes |
| LATE_AIRCRAFT_DELAY | Minutes attributable to inbound aircraft arriving late |
| CANCELLED | 1 if flight was cancelled, 0 otherwise |
| CANCELLATION_CODE | A = Carrier; B = Weather; C = National Aviation System; D = Security |
| DIVERTED | 1 if flight was diverted to an alternate airport |
The delay cause fields (CARRIER_DELAY, WEATHER_DELAY, NAS_DELAY, SECURITY_DELAY, LATE_AIRCRAFT_DELAY) are only populated when total arrival delay is fifteen minutes or more and the flight was not cancelled or diverted. The delay minutes across the five fields sum to the total arrival delay for delayed flights, though the attribution is done by the carrier and involves judgment calls that are not independently verified by BTS. Airlines have an obvious incentive to minimize the share attributed to carrier causes, and researchers have documented patterns suggesting systematic underreporting of carrier-caused delays.
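As a rough illustration of how the cause fields can be used, the sketch below totals the five cause columns over a few made-up records and checks that they sum to ARR_DELAY on each attributed flight. The sample records and the helper name `cause_shares` are hypothetical, and the sketch assumes unattributed flights carry missing cause fields rather than zeros.

```python
CAUSE_FIELDS = [
    "CARRIER_DELAY", "WEATHER_DELAY", "NAS_DELAY",
    "SECURITY_DELAY", "LATE_AIRCRAFT_DELAY",
]


def cause_shares(flights):
    """Return each cause's share of all attributed delay minutes.

    `flights` is a list of dicts keyed by the field names in the table above.
    Flights whose cause fields are missing (no attribution) are skipped.
    """
    totals = {field: 0.0 for field in CAUSE_FIELDS}
    for flight in flights:
        causes = [flight.get(field) for field in CAUSE_FIELDS]
        if any(value is None for value in causes):
            continue  # cause fields are not populated for this flight
        if abs(sum(causes) - flight["ARR_DELAY"]) > 1e-9:
            raise ValueError(f"cause minutes do not sum to ARR_DELAY: {flight}")
        for field, value in zip(CAUSE_FIELDS, causes):
            totals[field] += value
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {field: 0.0 for field in CAUSE_FIELDS}
    return {field: minutes / grand_total for field, minutes in totals.items()}


# Hypothetical sample records, not real BTS data.
sample_flights = [
    {"ARR_DELAY": 40, "CARRIER_DELAY": 10, "WEATHER_DELAY": 0, "NAS_DELAY": 5,
     "SECURITY_DELAY": 0, "LATE_AIRCRAFT_DELAY": 25},
    {"ARR_DELAY": 20, "CARRIER_DELAY": 20, "WEATHER_DELAY": 0, "NAS_DELAY": 0,
     "SECURITY_DELAY": 0, "LATE_AIRCRAFT_DELAY": 0},
    {"ARR_DELAY": -3},  # early arrival: no cause attribution
]

shares = cause_shares(sample_flights)
for field in CAUSE_FIELDS:
    print(f"{field:20s} {shares[field]:.1%}")
```

Because the attribution is self-reported, a share computed this way describes what carriers filed, not an independently audited split.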
The Five Delay Categories
DOT defines five distinct categories of airline delay for ATOP reporting purposes. The categories determine who the delay is attributed to—not for regulatory enforcement, but for public transparency and carrier benchmarking.
Carrier delay covers delays caused by circumstances within the carrier's control. This is the broadest category: it includes aircraft maintenance or mechanical issues, crew problems (crew availability, crew rest requirement violations, crew connectivity), aircraft cleaning, baggage loading, and fueling delays. Approximately 30 to 35 percent of total delayed minutes industry-wide are classified as carrier-caused. A carrier that routinely posts high carrier-delay fractions is signaling operational inefficiency that is visible in the public data.
National Aviation System (NAS) delay captures delays within the broad, interconnected NAS: Air Traffic Control (ATC) system volumes, ATC delays not attributed to weather, heavy traffic conditions at airports, and non-extreme weather that affects ATC operations—specifically, weather conditions below the threshold that would constitute a “weather delay” under DOT definitions. NAS delay is the second-largest category, typically 30 to 35 percent of delayed minutes in normal operating years. Heavy-traffic hub airports (ORD, LAX, JFK, ATL) generate disproportionate NAS delay because their traffic density leaves no slack to absorb disruptions.
Weather delay is defined narrowly: only severe weather that is beyond the standards of the NAS qualifies. Fog that reduces visibility below IFR minimums, severe thunderstorms that close sectors, winter storms that necessitate ground stops—these are weather delays. Weather that merely causes ground holds within normal ATC tolerance is classified as NAS delay, not weather delay. As a result, the weather delay category is smaller than most travelers assume, typically five to ten percent of delayed minutes. The bulk of weather impact flows through the NAS delay category.
Late aircraft delay, sometimes called the ripple or propagation effect, is the delay incurred because the aircraft arriving to operate the current flight arrived late from its previous city. A single early-morning mechanical issue can propagate through six or eight rotations of the same aircraft during the day. Late aircraft delay is typically the largest single delay category by minutes, accounting for 35 to 45 percent of all delay minutes industry-wide, because it amplifies every upstream disruption across the day's schedule. Carriers with more schedule slack absorb propagation better; ultra-low-cost carriers running tight aircraft utilization are particularly susceptible.
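The propagation effect can be sketched with a toy model. The function below is a deliberate simplification, not a description of how any carrier schedules aircraft: it assumes one aircraft, a single initial delay, a fixed number of minutes recovered at every turnaround, and no new disruptions later in the day. The name `propagate_delay` and all numbers are illustrative.

```python
def propagate_delay(initial_delay, slack_per_turn, legs):
    """Delay in minutes carried by each leg of one aircraft's day.

    Toy model: each turnaround recovers up to `slack_per_turn` minutes and
    no new delay is introduced after the first leg.
    """
    delays = []
    delay = initial_delay
    for _ in range(legs):
        delays.append(delay)
        delay = max(0, delay - slack_per_turn)
    return delays


# Same 60-minute morning problem, two different amounts of schedule padding.
tight = propagate_delay(initial_delay=60, slack_per_turn=5, legs=6)
padded = propagate_delay(initial_delay=60, slack_per_turn=20, legs=6)

print("tight schedule :", tight, "total minutes:", sum(tight))
print("padded schedule:", padded, "total minutes:", sum(padded))
```

In ATOP terms, the first leg's minutes would be attributed to the original cause, while the minutes inherited by later legs would show up as late aircraft delay, which is why a tightly scheduled fleet accumulates so many more minutes in that category from the same initial event.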
Security delay covers delays caused by security-related issues: security breaches at the checkpoint or gate, TSA staffing shortages that cause checkpoint queues to delay boarding, or security evacuations of terminals. Security delay is a small fraction of total delayed minutes—typically under one percent—but can cause large individual events when a terminal evacuation grounds all departing aircraft.
The Air Travel Consumer Report
The DOT Air Travel Consumer Report (ATCR) is a monthly publication that translates ATOP data into public-facing metrics for passengers and policymakers. Published by DOT's Office of Aviation Consumer Protection with BTS statistical support, the ATCR is the document journalists cite when reporting on airline on-time performance.
The headline metric is the on-time arrival rate: the percentage of operated flights that arrive less than fifteen minutes after their Computerized Reservation System (CRS) scheduled arrival time. An arrival delay of fourteen minutes or less counts as on-time; fifteen minutes or more counts as delayed. Industry-wide on-time arrival rates in normal operating years run between 78 and 85 percent. Individual carrier rates vary significantly: full-service network carriers at congested hub airports tend to run lower on-time rates than point-to-point carriers operating at less congested airports. In 2022, the industry average fell to approximately 75 percent, driven by staffing shortages and demand surges following the end of COVID capacity restrictions.
The ATCR also tracks the cancellation rate: the percentage of scheduled flights that were cancelled rather than operated. A one percent industry cancellation rate is roughly 50,000 to 60,000 cancelled flights per year. Cancellations above two percent trigger media and congressional attention. The Southwest Airlines operational meltdown in December 2022—when the carrier cancelled approximately 17,000 flights over a ten-day period following Winter Storm Elliott, due to a failure in its crew-scheduling software that could not handle the volume of crew-reassignment requests the storm generated—pushed Southwest's monthly cancellation rate to extraordinary levels and prompted a DOT investigation and $140 million settlement, the largest consumer protection enforcement action in DOT's history at the time.
Extreme delays—domestic flights delayed more than three hours from their scheduled departure—are tracked separately because they represent a qualitatively different passenger experience and trigger specific DOT compensation obligations. International flights delayed more than six hours from their scheduled departure are the international equivalent. Tarmac delays (aircraft sitting on the ground with passengers onboard, unable to deplane) exceeding three hours domestically or four hours internationally are subject to mandatory fines under the DOT tarmac delay rule.
T-100: Domestic and International Traffic Statistics
The T-100 program is the foundational capacity and traffic statistics dataset for US commercial aviation. It is separate from ATOP: where ATOP covers on-time performance for domestic flights at the individual flight level, T-100 covers traffic and capacity at the monthly route level for all scheduled and charter service, including international.
Form T-100D covers domestic service operated by US air carriers: every domestic route, every carrier, every month. Each record represents one carrier operating one origin-destination city pair in one month of one year. The key fields are:
| Metric | Definition |
|---|---|
| Available Seat Miles (ASM) | Total seats × miles flown; the measure of capacity deployed |
| Revenue Passenger Miles (RPM) | Revenue passengers carried × miles flown; the measure of traffic |
| Load Factor | RPM ÷ ASM; the percentage of capacity filled by revenue passengers |
| Revenue Passengers | Count of ticketed revenue passengers enplaned |
| Aircraft Departures Performed | Number of actual departures on the route in the month |
Form T-100I covers international service, including both US carriers and foreign carriers authorized to serve the US market under bilateral air service agreements. Foreign carriers report to DOT under 14 CFR Part 217 as a condition of their operating authority. T-100I data therefore includes routes operated by Lufthansa, British Airways, Aer Lingus, LATAM, and dozens of other foreign carriers, which makes it the only public US government source for traffic statistics on US-foreign city-pair routes operated by non-US carriers.
Load factor is the ratio T-100 makes most visible. Industry load factors have risen dramatically over the past two decades: in 2000, the domestic industry average load factor was approximately 72 percent; by 2019 it had reached 85 percent. Ultra-low-cost carriers typically run load factors of 87 to 90 percent, while full-service network carriers operating premium-heavy transoceanic routes sometimes report load factors in the high 80s on the most competitive routes. Higher load factors mean less slack in the system: when a flight is cancelled at 90 percent load factor, re-accommodating passengers is harder than at 75 percent, which contributes to the cascading operational problems visible in the ATOP data during periods of high demand.
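A minimal sketch of the load factor arithmetic, assuming each row carries monthly seats, monthly passengers, and the segment distance. The field names (`seats`, `passengers`, `distance_miles`) and the figures are hypothetical stand-ins, not the actual T-100 column names.

```python
def load_factor(segments):
    """Aggregate load factor (RPM / ASM) over T-100-style segment rows."""
    asm = sum(s["seats"] * s["distance_miles"] for s in segments)
    rpm = sum(s["passengers"] * s["distance_miles"] for s in segments)
    return rpm / asm if asm else float("nan")


# Hypothetical carrier-month with one long, full route and one short, emptier route.
segments = [
    {"seats": 18_000, "passengers": 15_300, "distance_miles": 1_000},
    {"seats": 6_000, "passengers": 4_200, "distance_miles": 500},
]

system_lf = load_factor(segments)
simple_avg = sum(s["passengers"] / s["seats"] for s in segments) / len(segments)

print(f"system load factor (RPM/ASM): {system_lf:.1%}")
print(f"unweighted route average:     {simple_avg:.1%}")
```

The two figures differ because RPM ÷ ASM weights each route by the seat miles flown on it; averaging per-route percentages would give the short route the same weight as the long one.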
Form 41: Carrier Financial Statistics
Form 41 is the DOT's quarterly financial reporting requirement for certificated air carriers. Major carriers file a series of financial schedules covering revenue, costs, employment, balance sheet items, aircraft operations costs, and fuel. The Form 41 data, published through BTS Transtats, is the primary public source for airline financial performance at the carrier level.
Schedule P-1.2 is the profit and loss statement by revenue type: passenger revenue, cargo revenue, mail revenue, other transport revenue, and non-transport revenue. It is the basis for computing Revenue per Available Seat Mile—RASM—by dividing total operating revenue by ASM from T-100. RASM is one of the two primary airline performance benchmarks used by equity analysts and in antitrust analysis.
Schedule T-2 covers aircraft operations costs: fuel and oil expense, crew costs, maintenance, depreciation and amortization of flight equipment, and other direct operating expenses. Combined with ASM, T-2 data produces Cost per Available Seat Mile—CASM—the primary cost efficiency metric for airlines. A carrier with lower CASM can sustain lower fares than a higher-CASM competitor; the relentless CASM reduction pursued by ultra-low-cost carriers through ancillary revenue, higher seat density, and faster aircraft turns is visible in the Form 41 data year over year.
Schedule F is the fuel cost and consumption schedule, distinguishing jet fuel expense from gallons consumed, enabling computation of the effective per-gallon fuel cost at the carrier level. Fuel typically represents 20 to 30 percent of total operating costs in a normal-price year and has spiked to 35 percent or more during high-price periods. The 2022 fuel price surge following Russia's invasion of Ukraine and associated commodity market disruptions drove fuel from roughly $2.00 per gallon in 2021 to over $3.50 per gallon at peak, with direct impact on carrier margins visible in the quarterly Form 41 data.
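The unit revenue, unit cost, and fuel ratios described above reduce to a few divisions. The sketch below uses round, made-up inputs rather than figures from any Form 41 filing, expresses RASM and CASM in cents per seat mile (a common convention, assumed here), and uses total operating expense as the CASM numerator. The function name `unit_metrics` is illustrative.

```python
def unit_metrics(operating_revenue, operating_expense, fuel_expense, gallons, asm):
    """Return RASM and CASM in cents per seat mile, plus fuel price and fuel share."""
    return {
        "rasm_cents": operating_revenue / asm * 100,
        "casm_cents": operating_expense / asm * 100,
        "fuel_price_per_gallon": fuel_expense / gallons,
        "fuel_share_of_opex": fuel_expense / operating_expense,
    }


# Hypothetical carrier-year: dollars, gallons, and available seat miles.
metrics = unit_metrics(
    operating_revenue=12.0e9,
    operating_expense=11.0e9,
    fuel_expense=2.75e9,
    gallons=1.0e9,
    asm=80.0e9,
)

print(f"RASM: {metrics['rasm_cents']:.2f} cents")
print(f"CASM: {metrics['casm_cents']:.2f} cents")
print(f"Fuel: ${metrics['fuel_price_per_gallon']:.2f}/gal, "
      f"{metrics['fuel_share_of_opex']:.0%} of operating expense")
```

The revenue and expense inputs would come from the Form 41 schedules and the ASM denominator from T-100, so the carrier and period must match across both sources before dividing.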
Schedule P-12 covers employment by labor category: pilots and copilots, other flight personnel, passenger service employees, maintenance employees, traffic handling, and administrative. The P-12 data combined with P-1.2 revenue data produces labor cost per employee and labor cost as a fraction of total operating expense, key metrics in collective bargaining and in DOT/DOJ merger review.
Schedule B-1 is the balance sheet: total assets, flight equipment net of depreciation, total debt, total equity. The B-1 data is used in merger antitrust analysis to assess carrier financial health, the ability to fund aircraft orders, and the debt load that constrains competitive behavior. When the DOJ challenged the American Airlines–US Airways merger in 2013, Form 41 B-1 and T-2 data formed a significant portion of the evidentiary record on each carrier's financial position and cost structure.
The COVID Airline Crisis
No discussion of BTS data is complete without addressing the 2020 COVID airline crisis, which produced the most dramatic collapse in commercial aviation demand in the history of US air travel and left a visible discontinuity in every BTS dataset.
In April 2020, domestic and international air travel effectively ceased. BTS T-100 data shows that revenue passenger miles fell 96 percent year-over-year in April 2020 compared to April 2019—the largest single-month demand shock ever recorded in the dataset. The US airline industry carried approximately 926 million passengers in 2019; in 2020 it carried 369 million, most of them in January and February before the collapse. Many carriers operated ghost schedules with load factors of 5 to 15 percent, unable to shut down completely because of slot requirements, essential service obligations, and contractual minimums with airports.
The federal government responded with the CARES Act, enacted in March 2020, which authorized $25 billion in payroll support grants and $25 billion in loans to passenger carriers. Payroll Support Program (PSP) grants were distributed to carriers in proportion to their labor costs, with the condition that carriers maintain employment levels and refrain from involuntary furloughs through September 2020. PSP extensions in December 2020 and March 2021 extended the employment protection through September 2021. In total, the three rounds of PSP provided approximately $54 billion to the airline industry and are credited with preventing the mass layoffs and carrier bankruptcies that industry analysts projected in spring 2020.
The Form 41 data for 2020 shows the financial impact with stark clarity: American, Delta, United, and Southwest together reported combined operating losses of approximately $35 billion for the year. Operating margins, which had averaged six to ten percent in the preceding boom years, fell to negative 50 to 60 percent at the low point. The balance sheet schedules show the debt load that carriers took on to survive: Delta alone added approximately $10 billion in debt during 2020, ending the year with total debt exceeding $17 billion.
The recovery was faster than most forecasts anticipated. Domestic leisure travel surged in summer 2021 as vaccination rates rose and travel restrictions lifted. By August 2021, domestic RPM had recovered to approximately 85 percent of August 2019 levels. International recovery lagged, particularly transatlantic and transpacific routes dependent on business travel and subject to longer country-level entry restrictions. The ATOP data for 2021 and 2022 captures the operational consequences of recovery: carriers that had shed experienced employees during 2020 found it difficult to restore capacity quickly, producing the above-average cancellation and delay rates that characterized summer 2021 and summer 2022.
Baggage, Complaints, and Consumer Data
The Air Travel Consumer Report covers three additional consumer-facing metrics beyond on-time performance: baggage mishandling, consumer complaints, and involuntary denied boarding.
Mishandled baggage is reported monthly as the number of mishandled baggage reports per one thousand passengers enplaned. Mishandled baggage includes delayed (routed to the wrong destination), damaged, or lost bags, and also wheelchair and motorized scooter mishandling, which is reported separately following 2018 DOT rule changes requiring detailed documentation of mobility device damage. Industry mishandled baggage rates fell significantly between 2007 and 2022 as airlines invested in RFID bag-tracking systems and automated sorting infrastructure. The rate reached a historic low of approximately 2.0 per thousand in 2019 before rising slightly during the 2022 staffing disruptions.
Consumer complaints are tabulated by carrier and category. The DOT complaint categories include: flight problems (cancellations and delays), baggage, customer service, refunds, fares, reservations and ticketing, disability, and advertising. Flight problems and refunds are typically the two largest categories in volume. The COVID refund crisis of spring 2020—when carriers initially refused cash refunds for cancelled flights, offering travel credits instead—generated a spike in refund complaints that DOT used as the basis for enforcement guidance clarifying that passengers are entitled to cash refunds for carrier-cancelled flights, not travel credits.
Involuntary denied boarding (IDB), meaning bumping passengers who hold confirmed reservations because the flight was oversold, is reported quarterly per ten thousand passengers. Carriers are required to seek volunteers before involuntarily denying boarding and to pay compensation to IDB passengers: 200 percent of the one-way fare for arrival delays of one to two hours on domestic flights and 400 percent for delays over two hours, each tier subject to its own dollar cap ($775 and $1,550 respectively under the limits DOT set in 2021). DOT adjusts the caps periodically for inflation, so the current rule should be checked before relying on a specific figure. IDB rates have fallen substantially as carriers moved to more sophisticated yield management that reduces overbooking frequency, reaching under 0.5 per ten thousand in recent years.
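The per-thousand and per-ten-thousand rates, and the tiered IDB compensation, can be sketched as follows. The dollar caps are passed in as parameters rather than hard-coded because DOT revises them; the cap values and passenger counts in the usage lines are placeholders, not the current regulatory amounts. The sketch also assumes that a passenger who reaches the destination less than one hour late is owed nothing.

```python
def rate_per(events, passengers, per):
    """Events per `per` passengers, e.g. per=1_000 or per=10_000."""
    return events / passengers * per


def idb_compensation(one_way_fare, arrival_delay_hours, cap_short, cap_long):
    """Domestic IDB compensation under the two-tier structure described above.

    cap_short applies to the 200 percent tier, cap_long to the 400 percent tier.
    """
    if arrival_delay_hours < 1:
        return 0.0  # assumption: no compensation below the first tier
    if arrival_delay_hours <= 2:
        return min(2 * one_way_fare, cap_short)
    return min(4 * one_way_fare, cap_long)


# Hypothetical monthly counts for one carrier.
bag_rate = rate_per(events=2_300, passengers=1_000_000, per=1_000)
idb_rate = rate_per(events=40, passengers=1_000_000, per=10_000)
print(f"mishandled bags per 1,000 passengers: {bag_rate:.2f}")
print(f"IDB per 10,000 passengers:            {idb_rate:.2f}")

# Placeholder caps for illustration only.
print(idb_compensation(300, 1.5, cap_short=1_000, cap_long=2_000))
print(idb_compensation(700, 3.0, cap_short=1_000, cap_long=2_000))
```

In the second compensation call, 400 percent of the fare would be $2,800, so the placeholder cap binds and the function returns the cap instead.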
The Tarmac Delay Rule
The tarmac delay rule is one of the most consequential consumer protection regulations in aviation and one whose quantitative impact is directly visible in BTS data. Before the rule's implementation, passengers could be—and regularly were—held on aircraft on the tarmac for four, five, or more hours without the ability to disembark, with no food, water, or functioning lavatories.
The event that drove the regulation was the JetBlue Valentine's Day 2007 crisis, in which an ice storm at JFK Airport caused JetBlue to keep passengers aboard aircraft on the tarmac for up to ten hours rather than acknowledge the cancellations and return them to the terminal. Approximately nine aircraft were stranded; passengers documented the conditions via social media. Congressional and DOT attention followed immediately.
DOT's tarmac delay rule, issued in December 2009, effective in April 2010 for domestic flights, and extended to international flights in August 2011, establishes the following requirements:
| Requirement | Domestic Threshold | International Threshold |
|---|---|---|
| Allow passengers to deplane | 3 hours | 4 hours |
| Provide food and water | 2 hours | 2 hours |
| Maintain operable lavatories | Throughout | Throughout |
| Provide medical attention if requested | Throughout | Throughout |
Exceeding the three-hour (domestic) or four-hour (international) tarmac limit without allowing passengers to deplane is an unfair and deceptive practice under 49 U.S.C. § 41712, subject to fines of up to $27,500 per passenger. A single wide-body aircraft with 280 passengers held for four hours on a domestic flight without being allowed to deplane could theoretically result in a $7.7 million fine. The magnitude of the per-passenger penalty structure creates strong carrier incentives to cancel a flight and return to the gate rather than risk a tarmac violation.
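The exposure arithmetic in the paragraph above can be written out as a small function. This is a sketch of the theoretical maximum only: it uses the per-passenger figure quoted above, the deplaning thresholds from the table, and treats any excess over the limit as a full violation, whereas actual penalties are set case by case. The function name is illustrative.

```python
MAX_FINE_PER_PASSENGER = 27_500  # dollars, the figure quoted above


def max_tarmac_exposure(passengers, hours_on_tarmac, international=False):
    """Theoretical maximum fine for one flight held past the deplaning limit."""
    limit_hours = 4 if international else 3
    if hours_on_tarmac <= limit_hours:
        return 0
    return passengers * MAX_FINE_PER_PASSENGER


domestic = max_tarmac_exposure(passengers=280, hours_on_tarmac=4)
international = max_tarmac_exposure(passengers=280, hours_on_tarmac=4, international=True)

print(f"domestic, 4 hours:      ${domestic:,}")
print(f"international, 4 hours: ${international:,}")
```

The same four-hour hold is a violation on a domestic flight but sits exactly at the limit on an international one, which is why the second call returns zero.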
The impact on tarmac delay statistics was immediate and dramatic. BTS's tarmac delay data (published separately from ATOP, also on Transtats) shows that tarmac delays exceeding three hours fell from approximately 700 per year in 2007–2009 to under 30 per year by 2013, and have remained at historically low levels since. The DOT attributes this reduction directly to the rule. Critics note that the rule has an unintended consequence: carriers cancel flights earlier during bad weather rather than risk tarmac violations, which can mean more total disrupted passengers even if those passengers are not stranded on aircraft.
Data Access: BTS Transtats and Related Sources
BTS Transtats at transtats.bts.gov is the primary interface for accessing the aviation datasets described in this article.
ATOP / On-Time Performance. The interactive query tool at transtats.bts.gov/DL_SelectFields.aspx?gnoyr_VQ=FGK allows selection by carrier, year, month, origin airport, and destination airport, with field-level selection for the fields to include in the download. BTS also maintains a bulk download interface with pre-packaged monthly ZIP files containing all carriers for a given year-month. The URL pattern is stable and allows programmatic download of entire years of data without the interactive interface.
T-100 domestic and international. The T-100 tables are available through the Transtats T-100 Market and T-100 Segment query interfaces. BTS also provides a REST API for T-100 data accessible at api.bts.gov, which supports filtering by carrier, origin, destination, year, and month and returns JSON or CSV. The API documentation is available through the BTS developer portal.
Form 41 financials. Form 41 schedule data is available through the Transtats financial data portal under the Air Carrier Financial Reports section. Schedules P-1.2, P-12, B-1, T-2, and F are all available for download. Data is available quarterly back to the early 1990s for major carriers; historical coverage varies by schedule. The Air Carrier Statistics (Form 41) database on Transtats provides both individual schedule downloads and combined annual summary files.
Consumer Report data. Monthly ATCR tables are published on the DOT Office of Aviation Consumer Protection website at transportation.gov/airconsumer. The PDF reports include data tables; BTS also provides the underlying data files for mishandled baggage, consumer complaints, and denied boarding separately through the Transtats portal.
FRED aggregate statistics. The Federal Reserve Bank of St. Louis's FRED database includes several BTS-derived series for aggregate aviation analysis: the RPM series (Air Carrier Traffic Statistics), load factor, and available seat miles. FRED's time-series interface provides convenient access for users who need the aggregate national trend rather than carrier- or route-level detail.
Python: Downloading ATOP Data and Computing Monthly On-Time Rate
The following script downloads twelve monthly ATOP ZIP files from BTS Transtats for a specified carrier and year, computes the monthly on-time arrival rate, average arrival delay, and cancellation rate, and plots the results as a dual-axis bar and line chart. The on-time rate appears as bars against the left axis; average arrival delay appears as a line against the right axis; the cancellation rate is annotated inside each bar.
```python
import io
import zipfile

import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import pandas as pd
import requests

# BTS Transtats: Airline On-Time Performance (ATOP) data
# Portal: https://transtats.bts.gov/DL_SelectFields.aspx?gnoyr_VQ=FGK
#
# BTS provides two download interfaces:
#   1. Interactive portal at transtats.bts.gov -- select carrier, year, month, fields
#   2. Pre-packaged monthly ZIP files, one per year-month
#
# Direct URL pattern for the pre-packaged monthly ZIP files:
#   https://transtats.bts.gov/PREZIP/On_Time_Reporting_Carrier_On_Time_Performance_1987_present_<YEAR>_<MONTH>.zip
#
# Each ZIP contains a single CSV with all reporting carriers for that year-month.
# We download 12 months, compute monthly on-time rate and average arrival delay,
# and plot a dual-axis chart (on-time rate bar + average delay line).

CARRIER = "WN"  # Southwest Airlines IATA code; change to "AA", "DL", "UA", etc.
YEAR = 2023     # Data year (1987-present available)
BASE_URL = "https://transtats.bts.gov/PREZIP/On_Time_Reporting_Carrier_On_Time_Performance_1987_present"

# Column names differ between the interactive download (e.g. ARR_DELAY) and the
# pre-packaged files (e.g. ArrDelay). These candidate lists are assumptions based
# on commonly seen headers; extend them if a file uses other names.
CARRIER_COLS = ["OP_UNIQUE_CARRIER", "UNIQUE_CARRIER", "REPORTING_AIRLINE", "UNIQUECARRIER"]
ARR_DELAY_COLS = ["ARR_DELAY", "ARRDELAY"]


def find_column(df, candidates):
    """Return the first candidate column present in df, or None."""
    return next((c for c in candidates if c in df.columns), None)


records = []
for month in range(1, 13):
    url = f"{BASE_URL}_{YEAR}_{month}.zip"
    try:
        resp = requests.get(url, timeout=120)
        resp.raise_for_status()
        with zipfile.ZipFile(io.BytesIO(resp.content)) as zf:
            csv_name = next((n for n in zf.namelist() if n.endswith(".csv")), None)
            if csv_name is None:
                print(f"  Month {month}: no CSV found in ZIP")
                continue
            with zf.open(csv_name) as f:
                df = pd.read_csv(f, encoding="latin-1", low_memory=False)
    except (requests.RequestException, zipfile.BadZipFile) as exc:
        print(f"  Month {month}: download failed -- {exc}")
        continue

    # Normalize column names
    df.columns = [c.strip().upper() for c in df.columns]

    carrier_col = find_column(df, CARRIER_COLS)
    delay_col = find_column(df, ARR_DELAY_COLS)
    if carrier_col is None or delay_col is None:
        print(f"  Month {month}: expected carrier/delay columns not found")
        continue

    # Filter to the carrier we want
    sub = df[df[carrier_col].astype(str).str.strip() == CARRIER].copy()
    if sub.empty:
        print(f"  Month {month}: no records for carrier {CARRIER}")
        continue

    # Arrival delay is in minutes; positive = late, negative = early
    sub["ARR_DELAY_NUM"] = pd.to_numeric(sub[delay_col], errors="coerce")

    # Cancelled and diverted flags (treated as 0 when the column is absent)
    for flag in ("CANCELLED", "DIVERTED"):
        if flag in sub.columns:
            sub[flag + "_NUM"] = pd.to_numeric(sub[flag], errors="coerce").fillna(0)
        else:
            sub[flag + "_NUM"] = 0

    # Operated flights only: cancelled and diverted flights are excluded
    # from the on-time rate denominator
    operated = sub[(sub["CANCELLED_NUM"] == 0) & (sub["DIVERTED_NUM"] == 0)]
    operated = operated.dropna(subset=["ARR_DELAY_NUM"])

    total_flights = len(sub)
    operated_flights = len(operated)
    cancelled_count = int(sub["CANCELLED_NUM"].sum())
    # DOT counts a flight as on time if it arrives less than 15 minutes late
    on_time_count = int((operated["ARR_DELAY_NUM"] < 15).sum())
    on_time_rate = round(on_time_count / operated_flights * 100, 1) if operated_flights > 0 else float("nan")
    avg_delay = round(operated["ARR_DELAY_NUM"].mean(), 1)
    cancel_rate = round(cancelled_count / total_flights * 100, 1)

    records.append({
        "month": month,
        "total_flights": total_flights,
        "operated_flights": operated_flights,
        "cancelled": cancelled_count,
        "cancel_rate_pct": cancel_rate,
        "on_time_count": on_time_count,
        "on_time_rate_pct": on_time_rate,
        "avg_arr_delay_min": avg_delay,
    })
    print(
        f"  Month {month:2d}: {total_flights:6,} flights | "
        f"on-time {on_time_rate}% | avg delay {avg_delay} min | "
        f"cancel rate {cancel_rate}%"
    )

if not records:
    raise SystemExit("No monthly data could be processed; nothing to plot.")

summary = pd.DataFrame(records).sort_values("month")
print("\nFull year summary:")
print(summary.to_string(index=False))

# --- Plot: monthly on-time rate (bars) and average arrival delay (line) ---
months = summary["month"].tolist()
on_time = summary["on_time_rate_pct"].tolist()
avg_delay = summary["avg_arr_delay_min"].tolist()
cancel_rate = summary["cancel_rate_pct"].tolist()
month_labels = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
]
labels = [month_labels[m - 1] for m in months]

fig, ax1 = plt.subplots(figsize=(11, 5))

# Bars: on-time rate
ax1.bar(labels, on_time, color="#0b4a8f", alpha=0.75, label="On-time rate (%)")
ax1.set_ylim(0, 100)
ax1.set_ylabel("On-Time Arrival Rate (%)", color="#0b4a8f")
ax1.tick_params(axis="y", labelcolor="#0b4a8f")
ax1.yaxis.set_major_formatter(mticker.FuncFormatter(lambda x, _: str(int(x)) + "%"))

# Secondary axis: average arrival delay
ax2 = ax1.twinx()
ax2.plot(labels, avg_delay, color="#d62728", marker="o", linewidth=2, label="Avg arrival delay (min)")
ax2.set_ylabel("Average Arrival Delay (minutes)", color="#d62728")
ax2.tick_params(axis="y", labelcolor="#d62728")

# Cancellation rate as text annotations near the base of each bar
for i, cr in enumerate(cancel_rate):
    ax1.text(i, 2, str(cr) + "%", ha="center", va="bottom", fontsize=7, color="white", fontweight="bold")

ax1.set_title(
    "BTS ATOP: Monthly On-Time Arrival Rate and Average Delay\n"
    f"{CARRIER} {YEAR} -- cancellation rate shown in bars",
    fontsize=11,
)
lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines1 + lines2, labels1 + labels2, loc="lower left", fontsize=9)
fig.tight_layout()
plt.savefig(f"bts_atop_{CARRIER}_{YEAR}.png", dpi=150, bbox_inches="tight")
plt.show()
print("Chart saved.")
```
A few notes on the implementation. BTS Transtats pre-packages monthly ATOP ZIP files containing all reporting carriers for a given month; the script filters to the requested carrier after download. The on-time threshold is fifteen minutes of arrival delay per DOT definition, applied only to operated (non-cancelled, non-diverted) flights. Cancelled flights are excluded from the on-time rate denominator but included in the cancellation rate numerator. The delay cause fields (CARRIER_DELAY, NAS_DELAY, etc.) are populated only for flights with arrival delay exceeding fifteen minutes, so they are not used in this script's on-time computation. The chart's cancellation rate annotation in the bars uses white text at y=2 to avoid collision with the bar baseline.