EPA Air Quality System: The Federal Monitor Network Behind NAAQS Compliance and Pollution Mapping
Every smog alert, every news story about unhealthy air days, and every county-level nonattainment designation issued by the EPA traces back to the same infrastructure: the Air Quality System. AQS is the federal repository for pollution monitor readings collected by more than 4,000 monitoring sites distributed across the United States, operated by a patchwork of state and local air agencies, tribal environmental programs, and EPA regional offices, all reporting hourly and daily concentration data for the six criteria pollutants that federal law requires to be kept below health-protective thresholds. Understanding what AQS measures, how the monitoring network is structured, how concentration data becomes regulatory decisions, and where the data has systematic blind spots is the subject of this piece.
This article covers the architecture of the monitoring network and who operates it; the six criteria pollutants and their National Ambient Air Quality Standards including the 2024 PM2.5 revision; how monitors are sited and what measurement methods are approved; the Air Quality Index calculation and its six health categories; how EPA designates nonattainment areas and what that requires of states; the data quality framework including QA codes and completeness thresholds; how AQS compares to real-time AirNow feeds and low-cost PurpleAir sensors; the health burden literature linking fine particulate matter to mortality; the environmental justice dimension of monitoring network coverage; wildfire smoke and the exceptional events provisions; and how to access AQS data programmatically via the public API.
The monitoring network: who operates monitors and why
The AQS monitoring network is not a single federal program. It is a cooperative system in which the EPA sets the measurement standards, approves monitoring plans, and maintains the data repository, while state and local air quality agencies bear most of the operational responsibility. The Clean Air Act requires states to develop monitoring networks adequate to determine compliance with National Ambient Air Quality Standards and to report data to EPA. In practice, roughly 4,200 active monitoring sites collect measurements across the country, operated by approximately 150 separate state, local, and tribal agencies. EPA directly operates only a small fraction of monitors, mostly at remote reference sites.
The institutional reason for state and local operation is capacity and geography. A state air agency knows where its major industrial sources are, where its population centers are, and where its air quality problems have historically concentrated. Local air districts (particularly in California, where the Air Resources Board and roughly 35 local air quality management districts share regulatory authority) maintain dense monitoring networks in urban areas and near large industrial facilities. Tribal environmental programs, funded partly through EPA grants, operate monitors on tribal lands where state jurisdiction does not extend. This patchwork produces coverage that is denser in populated and industrialized areas and sparse in rural and agricultural regions.
Monitors report data to AQS on a schedule that varies by measurement type. Continuous analyzers for ozone, CO, SO2, NO2, and some PM2.5 monitors transmit hourly averages. Filter-based PM2.5 Federal Reference Method monitors operate on sampling schedules, typically one day in three or one day in six, and their filters are weighed in a laboratory before the result enters AQS, introducing a lag of several weeks between sampling and data availability. Hourly continuous PM2.5 data from beta attenuation monitors and other continuous instruments enters AQS much faster and feeds the real-time AirNow display system, though that data is not yet validated against reference methods.
The six criteria pollutants and their NAAQS standards
The Clean Air Act directs EPA to establish National Ambient Air Quality Standards for pollutants that “endanger public health or welfare” and that come from “numerous or diverse mobile or stationary sources.” EPA has designated six such criteria pollutants, each with a primary standard (protecting public health) and, for most, a secondary standard (protecting public welfare including crops, ecosystems, and visibility). The standards are concentration limits expressed as specific averaging times: annual averages, 24-hour averages, 8-hour averages, or 1-hour averages, depending on the pollutant's health effects profile.
| Pollutant | Primary NAAQS | Averaging time |
|---|---|---|
| PM2.5 (fine particles) | 9.0 μg/m³ annual; 35 μg/m³ 24-hr | Annual mean; 24-hr |
| PM10 (coarse particles) | 150 μg/m³ | 24-hr |
| Ozone (O3) | 0.070 ppm | 8-hr daily max |
| Carbon monoxide (CO) | 35 ppm (1-hr); 9 ppm (8-hr) | 1-hr; 8-hr |
| Sulfur dioxide (SO2) | 75 ppb | 1-hr |
| Nitrogen dioxide (NO2) | 53 ppb annual; 100 ppb 1-hr | Annual mean; 1-hr |
| Lead (Pb) | 0.15 μg/m³ | 3-month rolling avg |
PM2.5 (particles with an aerodynamic diameter of 2.5 micrometers or less) is the criteria pollutant with the largest public health significance. Fine particles penetrate deep into the alveolar region of the lung, where they cause oxidative stress, inflammation, and systemic cardiovascular effects. In February 2024, the Biden EPA finalized a revision of the annual PM2.5 standard from 12 μg/m³ to 9 μg/m³, the first tightening since 2012. The revision was grounded in the accumulated epidemiological evidence linking long-term PM2.5 exposure to cardiovascular mortality, lung cancer, and all-cause mortality even at concentrations below the prior standard. The Trump administration had completed its own review in 2020 and retained the annual standard at 12 μg/m³; the Biden EPA reconsidered that decision and tightened the standard.
Ozone is a secondary pollutant formed by photochemical reactions between nitrogen oxides and volatile organic compounds in sunlight. It is not directly emitted; it forms in the atmosphere from precursor emissions by vehicles, power plants, industrial sources, and even vegetation. Ground-level ozone is a powerful oxidant that causes airway inflammation, reduces lung function, triggers asthma attacks, and increases emergency department visits and hospital admissions for respiratory disease. The current 8-hour standard of 0.070 ppm was set in 2015. A 2022 scientific review recommended tightening it to 0.060 ppm, but no final rule had been issued by 2026.
PM10 covers inhalable particles with an aerodynamic diameter of 10 micrometers or less; its coarse fraction, between 2.5 and 10 micrometers, is associated primarily with dust, pollen, and mechanical abrasion sources. That coarse fraction is less health-relevant than PM2.5 at equivalent mass concentrations because the particles do not penetrate as deeply into lung tissue, but PM10 remains regulated as an indicator of coarse particulate burden. Sulfur dioxide originates primarily from coal combustion and metal smelting; its sharp reduction since 1990 is one of the Clean Air Act's most visible successes, with ambient SO2 concentrations falling more than 90% nationally over 35 years. Carbon monoxide is primarily a combustion product from motor vehicles and is now uncommon at NAAQS-exceedance levels outside localized near-source environments. Lead was removed from gasoline in the 1970s and 1980s, producing a dramatic collapse in ambient concentrations; monitoring now focuses on facilities near lead smelters, battery manufacturers, and scrap metal operations.
Monitor placement: siting criteria and network types
Where a monitor is placed determines what it measures, and EPA's 40 CFR Part 58 regulations specify siting criteria in considerable technical detail. The regulations distinguish three primary placement orientations:
Population-oriented monitors are placed to measure concentrations representative of what people actually breathe. They are sited away from local point sources and placed at heights and in locations that avoid atypical micro-scale exposure conditions. Population-oriented siting is required for NAAQS compliance monitoring in metropolitan areas: the measured concentration should represent the exposure of a substantial population, not just the immediate vicinity of the monitor.
Source-oriented monitors are placed to capture maximum concentrations near identified emission sources—a power plant fence line, a refinery perimeter, a busy highway corridor. Source-oriented monitors are typically not used for NAAQS attainment determinations but provide data for permit compliance monitoring, health impact assessment near facilities, and enforcement support. They are often required as permit conditions for major new sources.
Background monitors are placed far from significant anthropogenic sources, in remote or rural settings, to measure the baseline concentration against which urban and source-influenced readings are compared. Background data is critical for distinguishing local emission contributions from transported pollution and for modeling natural background contributions to ozone and PM2.5.
EPA regulations also define several network types. The State and Local Air Monitoring Stations (SLAMS) network is the core NAAQS compliance monitoring system, operated by state and local agencies under EPA-approved monitoring plans. The National Air Monitoring Stations (NAMS) subset consists of the highest-priority sites within SLAMS, typically in metropolitan areas with the highest pollution burden, and is subject to more stringent QA requirements. The Photochemical Assessment Monitoring Stations (PAMS) network focuses on ozone precursor measurements (speciated volatile organic compounds and nitrogen oxide species) at sites across the major urban ozone nonattainment areas, providing the precursor data needed for photochemical grid model development and State Implementation Plan support.
The measurement method matters for regulatory purposes. A Federal Reference Method (FRM) is the gold-standard method that EPA has designated as the official basis for NAAQS attainment determinations. For PM2.5, the FRM is a 24-hour filter-based gravimetric method: ambient air is drawn through a Teflon filter at a calibrated flow rate, the filter is weighed before and after sampling under controlled humidity conditions, and the mass difference divided by the air volume gives the PM2.5 concentration in μg/m³. A Federal Equivalent Method (FEM) is an alternative measurement approach that has been demonstrated to produce results equivalent to the FRM under field conditions. Beta attenuation monitors (BAM) are the most common PM2.5 FEM, providing near-continuous hourly data by measuring the attenuation of beta radiation through a particle-loaded filter tape. Both FRM and FEM data are used for regulatory attainment determinations; FRM data alone served that role prior to 2009, when FEM PM2.5 equivalency was formally established.
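The gravimetric arithmetic described above can be sketched in a few lines. This is an illustration only: the function name is hypothetical, the filter weights are made-up numbers, and the 16.67 L/min default is the nominal design flow of a PM2.5 reference sampler (about one cubic meter per hour), whereas a real calculation would use the sampler's recorded volume.

```python
def gravimetric_pm25(pre_weight_mg, post_weight_mg, flow_lpm=16.67, hours=24.0):
    """Return a PM2.5 concentration in ug/m3 from filter weights and sampled volume."""
    mass_ug = (post_weight_mg - pre_weight_mg) * 1000.0   # mg -> ug
    volume_m3 = flow_lpm * 60.0 * hours / 1000.0           # litres -> cubic metres
    return mass_ug / volume_m3


# Hypothetical filter: gains 0.288 mg over a 24-hour sample (about 24 m3 of air).
conc = gravimetric_pm25(pre_weight_mg=145.000, post_weight_mg=145.288)
print(round(conc, 1), "ug/m3")
```

The example makes the sensitivity of the method visible: a few hundred micrograms of collected mass separate a clean day from an exceedance, which is why the weighing conditions are so tightly controlled.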
The Air Quality Index: 0 to 500 and what it means
The Air Quality Index is the public-facing translation of pollutant concentration data into a single dimensionless scale from 0 to 500, with six color-coded categories that correspond to health guidance. The AQI was designed to give the public an intuitive measure of daily air quality without requiring them to understand μg/m³ or ppb concentration values.
| AQI range | Category | Health message |
|---|---|---|
| 0–50 | Good | No health concern |
| 51–100 | Moderate | Unusually sensitive individuals may be affected |
| 101–150 | Unhealthy for Sensitive Groups | Elderly, children, asthma patients should limit outdoor exertion |
| 151–200 | Unhealthy | Everyone may experience health effects; sensitive groups severe effects |
| 201–300 | Very Unhealthy | Health alert; everyone should avoid prolonged outdoor activity |
| 301–500 | Hazardous | Health emergency; entire population at risk |
The AQI calculation involves a piecewise linear transformation that maps concentration breakpoints for each pollutant onto the 0–500 scale. For PM2.5, the AQI breakpoints are: 0–9 μg/m³ maps to AQI 0–50; 9.1–35.4 maps to 51–100; 35.5–55.4 maps to 101–150; 55.5–125.4 maps to 151–200; 125.5–225.4 maps to 201–300; and 225.5–325.4 maps to 301–500. The formula within each segment is linear interpolation between the AQI values at the concentration endpoints. The daily AQI is the maximum of the individual pollutant AQI values computed for that day (the worst-of-pollutants approach). If the ozone sub-index is 87 and the PM2.5 sub-index is 103, the reported AQI is 103 (PM2.5 driven) and the health category corresponds to that pollutant.
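The interpolation and the worst-of-pollutants rule can be sketched as follows. This is a minimal illustration, not EPA's reference implementation: the function names are hypothetical, the input is assumed to be a 24-hour PM2.5 value already reported to one decimal place, and the ceiling of 500 for the top segment is an assumption taken from the Hazardous row of the category table.

```python
# PM2.5 segments as (C_low, C_high, AQI_low, AQI_high), concentrations in ug/m3.
# Assumption: the top segment ends at AQI 500, matching the Hazardous category.
PM25_BREAKPOINTS = [
    (0.0, 9.0, 0, 50),
    (9.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 125.4, 151, 200),
    (125.5, 225.4, 201, 300),
    (225.5, 325.4, 301, 500),
]


def pm25_sub_index(conc):
    """Linearly interpolate the AQI within the segment that contains conc."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if conc <= c_hi:
            return round((i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo)
    return 500  # assumption: cap readings above the last breakpoint


def daily_aqi(sub_indices):
    """Worst-of-pollutants rule: return (AQI, name of the driving pollutant)."""
    driver = max(sub_indices, key=sub_indices.get)
    return sub_indices[driver], driver


print(pm25_sub_index(12.0))
print(daily_aqi({"ozone": 87, "pm25": 103}))
```

Keeping the breakpoints in a table rather than in branching logic makes it easy to swap in the pre-2024 values when reconstructing a historical AQI series.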
The AQI breakpoints were updated in May 2024 alongside the PM2.5 NAAQS tightening. The new PM2.5 breakpoints align the “Good” category upper limit with the new 9 μg/m³ annual standard, so a day with PM2.5 concentrations up to 9 μg/m³ remains in the green zone. The prior breakpoints had Good extending to 12 μg/m³, consistent with the old annual standard; the revision means that days formerly reported as Good in the 9–12 μg/m³ range now fall in the Moderate category. This change affects public health communication in areas with chronically elevated PM2.5 and produces apparent discontinuities in historical AQI trend analysis spanning the 2024 revision date.
Nonattainment: designation, SIPs, and the political cycle
When ambient monitoring data shows that an area exceeds a NAAQS standard, EPA must formally designate that area as nonattainment. The designation process begins with EPA issuing a proposed designation based on air quality data from the three most recent complete years of monitoring. States may submit recommended area boundaries and classifications; EPA considers those recommendations but is not bound by them. A final nonattainment designation triggers a cascade of regulatory obligations under the Clean Air Act Title I.
Once designated nonattainment, a state must submit a State Implementation Plan (SIP), a formal plan demonstrating how the area will achieve the standard within specified deadlines. SIP requirements vary by pollutant and nonattainment classification severity. PM2.5 nonattainment areas are classified as Moderate or Serious depending on the degree of exceedance. A Moderate PM2.5 nonattainment area has six years to attain the standard; a Serious area has ten years, but faces additional SIP requirements including Best Available Control Technology for major stationary sources. Ozone nonattainment areas have a five-tier classification from Marginal to Extreme, with increasingly stringent requirements and longer attainment deadlines at higher tiers.
SIP requirements include reasonably available control measures (RACM) for Moderate PM2.5 areas, best available control measures (BACM) for Serious areas, and quantitative milestones demonstrating reasonable further progress toward attainment at regular intervals. Areas that fail to attain by their deadline are reclassified to a more severe category—a process called “bump-up”—with more stringent requirements and shortened timelines. Persistent failure to attain can trigger federal sanctions including the loss of highway funding and mandatory offsets for new emission sources in the nonattainment area.
The politics of NAAQS revision are intense. Industry groups, state governments, and labor interests regularly challenge NAAQS tightening on economic grounds; public health advocates and scientific advisory panels push for adherence to the scientific evidence regardless of implementation cost. The Clean Air Act explicitly prohibits EPA from considering implementation costs when setting NAAQS (only health evidence is supposed to drive the standard), but administrations have found procedural mechanisms to slow or avoid tightening. The Trump administration in 2020 rejected a Clean Air Scientific Advisory Committee recommendation to tighten the annual PM2.5 standard below 12 μg/m³, citing a disputed economic analysis; the Biden administration reversed course in 2024 and adopted the 9 μg/m³ standard that the scientific evidence had supported for years. The consequence is that dozens of counties previously in attainment at 12 μg/m³ will face nonattainment designation once the 2024 standard takes regulatory effect.
Data quality: QA codes, completeness, and the AQS validation framework
Raw monitor readings in AQS are not uniformly suitable for regulatory use. The AQS data quality framework assigns qualifier codes and null data codes to individual measurements to indicate their validity status and the reason for any missing or flagged data. Qualifier codes identify conditions that affect measurement quality without necessarily invalidating the data: instrument maintenance, equipment malfunction, unusual meteorological conditions, nearby construction, or operator-documented exceptional events. Null data codes identify periods where no valid measurement exists: scheduled maintenance, power outage, calibration, voided samples, or instrument failure.
For NAAQS compliance determinations, EPA applies a data completeness requirement: a monitoring site must capture valid readings for at least 75% of the required sampling periods within a calendar quarter (for 24-hour PM2.5 averages) or within an 8-hour period (for ozone). A monitor that captures fewer than 75% of required readings in a quarter produces an “incomplete data” flag, and its data may not be used to determine attainment or nonattainment unless EPA approves its use on the basis that the missing data would not have affected the determination. This completeness requirement is both a data quality safeguard and a source of strategic behavior: facilities and their regulatory advocates sometimes argue that missing data during high-pollution periods should be treated as if it would have been below the standard, while regulators may take the opposite position.
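A quarterly completeness check along these lines can be sketched with pandas. The function name, the column names (`date`, `valid`), and the synthetic one-in-three sampling schedule are assumptions made for illustration; they are not AQS field names, and the real regulatory test has substitution rules this sketch ignores.

```python
import pandas as pd


def quarterly_completeness(samples, threshold=0.75):
    """Capture rate per calendar quarter for one monitor.

    samples: DataFrame with 'date' (scheduled sampling day) and
             'valid' (True if a valid measurement was obtained).
    """
    quarter = pd.to_datetime(samples["date"]).dt.quarter
    out = samples.groupby(quarter)["valid"].agg(scheduled="size", valid="sum")
    out["capture_rate"] = out["valid"] / out["scheduled"]
    out["complete"] = out["capture_rate"] >= threshold
    out.index.name = "quarter"
    return out


# Synthetic one-in-three-day schedule for 2023; the first ten samples are lost.
dates = pd.date_range("2023-01-01", "2023-12-31", freq="3D")
samples = pd.DataFrame({"date": dates, "valid": True})
samples.loc[:9, "valid"] = False  # label-based slice, rows 0 through 9 inclusive

report = quarterly_completeness(samples)
print(report)
```

In this synthetic case the first quarter captures 20 of 30 scheduled samples and falls below the 75% threshold, while the remaining quarters pass.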
The data quality framework in AQS creates an important distinction between the raw monitor database and the regulatory-valid dataset. When working with AQS API downloads or bulk data, analysts should filter on event_type and data quality flags to understand whether they are working with the regulatory baseline, with all measurements including qualified data, or with data that includes exceptional events. The three categories produce materially different annual averages in regions with significant wildfire smoke or industrial upset events.
AQS vs. AirNow vs. PurpleAir: accuracy comparison
Analysts working with air quality data encounter three distinct data streams that are often confused.
AQS is the validated, regulatory-grade archive. Its PM2.5 data is based on FRM or FEM instruments with full QA review, but it has significant latency: filter-based FRM data may not appear in AQS until weeks or months after sampling.
AirNow is EPA's near-real-time public display system, fed by continuous monitor readings (primarily BAM and optical instruments) that have not yet completed full QA review. AirNow data is what powers the AQI colors on weather apps and EPA's air quality forecast maps; it is available within hours of measurement but is preliminary and subject to revision.
PurpleAir sensors are low-cost consumer-grade laser particle counters that measure light scattering by particles and infer PM2.5 through an empirical correction algorithm. PurpleAir data is available in near-real-time at high spatial density and has been adopted widely by health advocates and researchers working in monitoring-sparse areas, but it requires significant correction: raw PurpleAir readings overestimate PM2.5 during wildfire smoke events by 30–80% relative to FRM instruments, and several correction algorithms (EPA's US-wide correction, the LRAPA correction for wood smoke environments) have been developed to close the gap. The AQI values that appear on IQAir and similar air quality apps during wildfire events frequently mix preliminary AirNow data with uncorrected PurpleAir data, producing AQI values that can exceed 500 and that are not comparable to regulatory AQS data.
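To make the idea of a sensor correction concrete, here is a minimal sketch of the linear form of EPA's US-wide PurpleAir correction. The coefficients are quoted from the published correction (Barkjohn et al., 2021) and should be checked against EPA's current documentation before use, since an extended form is applied at very high smoke concentrations. The function name and the sample inputs are hypothetical.

```python
def epa_us_correction(pa_cf1, rh):
    """Linear US-wide correction for PurpleAir PM2.5.

    pa_cf1: average of the sensor's A and B channel cf_1 PM2.5 values (ug/m3)
    rh:     relative humidity reported by the sensor (percent)
    Coefficients are an assumption to verify against EPA documentation.
    """
    return 0.524 * pa_cf1 - 0.0862 * rh + 5.75


raw = 100.0        # hypothetical raw reading during a smoke event
corrected = epa_us_correction(raw, rh=50.0)
print(round(corrected, 2), "ug/m3 after correction, versus", raw, "raw")
```

The roughly halved slope is the important feature: it is what pulls inflated smoke readings back toward reference-monitor values.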
Health burden: mortality evidence and BenMAP
The regulatory and public health significance of PM2.5 rests on a large and convergent body of epidemiological evidence linking long-term fine particulate exposure to cardiovascular mortality, lung cancer mortality, and all-cause mortality. Two studies are foundational to the policy debate.
The Harvard Six Cities study, published by Dockery and colleagues in the New England Journal of Medicine in 1993, was the first major prospective cohort study to demonstrate a statistically significant association between ambient PM2.5 concentrations and all-cause mortality. Following 8,111 adults in six US cities for 14 to 16 years, the study found that residents of the most polluted city (Steubenville, Ohio) had a 26% higher all-cause mortality rate than residents of the least polluted city (Portage, Wisconsin) after controlling for age, sex, smoking, and other covariates. The estimated concentration-response relationship was approximately a 13 to 14% increase in all-cause mortality per 10 μg/m³ increase in annual PM2.5. The Harvard Six Cities study generated intense controversy from industry, which challenged the results and demanded access to the underlying data; subsequent independent reanalysis confirmed the original findings.
The Pope et al. cohort studies, using the American Cancer Society Cancer Prevention Study II cohort of approximately 500,000 adults, extended the Six Cities findings to a nationally representative sample across multiple decades. The 2002 Pope analysis found a 6% increase in cardiopulmonary mortality and an 8% increase in lung cancer mortality per 10 μg/m³ increase in annual PM2.5. Subsequent analyses extended through 2015 and confirmed that the mortality associations persisted even as ambient concentrations declined from 1990s levels, implying that there is no safe threshold within the current regulatory range. EPA's Integrated Science Assessment for PM, the comprehensive scientific review that supports each NAAQS revision, incorporates the Six Cities, ACS, and dozens of subsequent cohort studies, meta-analyses, and time-series studies into the evidence base for its causal determination.
EPA quantifies the health benefits of NAAQS tightening using BenMAP (the Environmental Benefits Mapping and Analysis Program), a geospatial model that applies concentration-response functions from the epidemiological literature to the population distribution and projects avoided deaths, hospitalizations, and lost work days resulting from specified air quality improvements. BenMAP analyses conducted for the 2024 PM2.5 NAAQS revision estimated that reducing the annual standard from 12 to 9 μg/m³ would prevent approximately 4,200 to 9,900 premature deaths per year in the United States, with economic benefits valued between $37 billion and $88 billion annually. The attribution of roughly 100,000 or more premature deaths per year to PM2.5 exposure at current concentrations, a figure that has appeared in peer-reviewed literature and EPA regulatory analyses, reflects the aggregate burden across the full population exposure distribution rather than the incremental effect of just the nonattainment areas.
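The way a concentration-response function turns an air quality change into avoided deaths can be sketched with the log-linear form commonly used for such functions. This is not BenMAP's code: the function name, the population, and the baseline mortality rate are hypothetical, and the 6% per 10 μg/m³ relative risk is the cardiopulmonary estimate quoted earlier from the 2002 Pope analysis.

```python
import math


def avoided_deaths(population, baseline_rate, rr_per_10, delta_pm):
    """Log-linear health impact function.

    population:    exposed population (persons)
    baseline_rate: baseline annual mortality rate (deaths per person per year)
    rr_per_10:     relative risk per 10 ug/m3 increase in annual PM2.5
    delta_pm:      reduction in annual PM2.5 (ug/m3)
    """
    beta = math.log(rr_per_10) / 10.0          # risk coefficient per 1 ug/m3
    attributable_fraction = 1.0 - math.exp(-beta * delta_pm)
    return population * baseline_rate * attributable_fraction


# Hypothetical county: 1,000,000 residents, baseline cardiopulmonary mortality
# of 4 per 1,000 per year, annual PM2.5 falling from 12 to 9 ug/m3.
estimate = avoided_deaths(1_000_000, 0.004, rr_per_10=1.06, delta_pm=3.0)
print(round(estimate, 1), "avoided cardiopulmonary deaths per year")
```

A full analysis repeats this calculation for every grid cell or census unit with its own population, baseline rate, and modeled concentration change, then sums the results.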
Environmental justice: monitoring gaps and pollution burden
The distribution of air quality monitoring infrastructure is not neutral with respect to race and income. Multiple independent analyses—using AQS monitor location data, Census demographic data, and EPA's EJScreen environmental justice screening tool—have documented that monitoring networks are systematically thinner in communities of color and low-income communities relative to predominantly white and higher-income communities.
A 2021 analysis published in Environmental Health Perspectives by Fowler and colleagues found that census tracts with higher proportions of Hispanic and Black residents were significantly less likely to have a PM2.5 monitor within a 10-kilometer radius than census tracts with similar pollution burden but predominantly white populations. Research by Casey et al. using satellite-derived PM2.5 estimates alongside AQS ground-truth data found that ground-based monitors underrepresent PM2.5 concentrations in high-minority census tracts relative to adjacent areas, which combined with the monitoring gap means that the regulatory data record may systematically underestimate pollution burden in environmental justice communities. The practical consequence is that communities bearing the highest pollution burden may also be the communities with the least regulatory data to demonstrate that burden and to support nonattainment designation or enforcement action.
EPA's EJScreen tool, available at epa.gov/ejscreen, provides census-tract-level environmental justice indices that combine pollution burden estimates with demographic data, including a PM2.5 exposure indicator derived from a combination of AQS monitor data and EPA's air quality modeling system. The EJScreen PM2.5 indicator is not the same as AQS monitor data—it incorporates modeled concentrations to fill geographic gaps—but it provides a more complete spatial picture of PM2.5 exposure distribution than AQS ground monitors alone. Cross-referencing EJScreen PM2.5 percentiles with census demographics confirms the pattern documented in the literature: communities at the top of the EJScreen PM2.5 distribution are disproportionately communities of color, particularly near major industrial corridors such as the Houston Ship Channel, the Ports of Los Angeles and Long Beach, Chicago's south and west sides, and the Mississippi River industrial corridor between Baton Rouge and New Orleans.
Wildfire smoke and exceptional events
The 2020–2024 period produced some of the most severe wildfire smoke events in recorded Western US history, with PM2.5 concentrations regularly reaching AQI levels above 200 and occasionally above 400 at monitors across California, Oregon, Washington, Idaho, and Montana. These events exposed a structural tension in the AQS regulatory framework: smoke-driven PM2.5 exceedances count against NAAQS attainment unless the state successfully petitions EPA to exclude the affected days under the exceptional events provisions of the Clean Air Act.
The exceptional events rule, codified at 40 CFR 50.14, allows states to flag monitoring data affected by events that are “not reasonably preventable” by the state or local authority—including wildfire smoke, high-wind dust events, and transported international pollution—and to petition EPA to exclude those flagged days from the NAAQS compliance calculation. The petition process requires the state to demonstrate that the event was exceptional (not typical), influenced the measured concentration, and was not reasonably attributable to the state's own emission sources. EPA must concur with the exclusion for it to take effect.
Wildfire exceptional event petitions have become routine for Western states. California, Oregon, and Washington file dozens of wildfire exclusion petitions annually. The aggregate effect is significant: removing wildfire days from the ozone attainment calculation can be the difference between attainment and nonattainment for counties in areas where wildfire smoke drives O3 precursor chemistry. Critics of the exceptional events provision argue that as wildfires become more frequent and severe due to climate change, an exception designed for rare natural events is being stretched to cover what is now a seasonal air quality crisis with substantial, preventable health impacts.
The AirNow Fire and Smoke Map, at fire.airnow.gov, provides a real-time display of PM2.5 concentrations from AQS monitors and PurpleAir sensors during wildfire events, integrated with satellite smoke plume imagery from NOAA's Hazard Mapping System. The map distinguishes smoke-affected monitoring readings from the background AQI display and is the primary public tool for communicating wildfire smoke health risk during active events. It uses EPA's PurpleAir correction algorithm to adjust low-cost sensor readings closer to FRM-equivalent values, substantially improving the accuracy of high-AQI smoke readings relative to uncorrected PurpleAir data.
Accessing AQS data: API, AirData, and parameter codes
The EPA AQS public API at aqs.epa.gov/data/api/ provides programmatic access to the full AQS database—monitor metadata, hourly data, daily summary data, and annual summary data—at no cost with a free registered account. The API uses simple HTTP GET requests with query parameters and returns JSON. The key endpoints for air quality research are:
dailyData/byState: returns daily summary records (arithmetic mean, AQI, maximum hour, completeness) for all monitors in a state for a given date range and parameter code. This is the primary endpoint for annual attainment analysis, trend research, and health impact studies. The parameter code 88101 designates PM2.5 Local Conditions (FRM/FEM 24-hour samples); 88502 designates PM2.5 non-FRM continuous monitors. For ozone, the parameter code is 44201.
sampleData/byState: returns individual sample measurements before daily aggregation. For filter-based PM2.5 monitors, this is a single 24-hour value per monitor per sampling day. For continuous ozone monitors, this is hourly values. This endpoint is appropriate for research requiring sub-daily temporal resolution or for computing custom aggregations.
annualData/byState: returns EPA's pre-computed annual summary statistics, including the design value (the three-year rolling average used for NAAQS attainment determination) and the 98th percentile 24-hour concentration (used for PM2.5 24-hr NAAQS evaluation). Using the pre-computed annual endpoint is considerably faster than computing annual statistics from daily data for multi-state or multi-year analyses.
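The three-year averaging behind a design value can be sketched as follows. The annual statistics are hypothetical numbers for a single monitor, the function name is illustrative, and the rounding (one decimal for the annual form, a whole number for the 24-hour form) is an assumption about the reporting convention rather than a full implementation of the regulatory procedure.

```python
def design_value(yearly_stats, digits):
    """Average exactly three consecutive yearly statistics and round."""
    if len(yearly_stats) != 3:
        raise ValueError("a design value needs exactly three years of data")
    return round(sum(yearly_stats) / 3.0, digits)


# Hypothetical statistics for one monitor, 2021 through 2023 (ug/m3).
annual_means = [9.8, 8.7, 9.1]      # annual mean PM2.5
p98_daily = [41.0, 29.0, 33.0]      # 98th percentile 24-hour PM2.5

annual_dv = design_value(annual_means, digits=1)
daily_dv = design_value(p98_daily, digits=0)

print("Annual design value:", annual_dv, "exceeds 9.0:", annual_dv > 9.0)
print("24-hour design value:", daily_dv, "exceeds 35:", daily_dv > 35.0)
```

The example shows why the three-year form matters: a single bad year (41 μg/m³ here) does not by itself produce a 24-hour violation, while an annual mean that hovers just above 9 does.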
For analysts who prefer file downloads over API queries, EPA's AirData download center at aqs.epa.gov/aqsweb/airdata/download_files.html provides annual pre-packaged CSV files of daily PM2.5, ozone, and other pollutant data organized by state or by CBSA (Core-Based Statistical Area). The AirData files include the regulatory-valid dataset with exceptional event flags applied (the same data that EPA uses for nonattainment determinations) and are updated annually after final QA review is complete.
Python: pulling daily PM2.5 from the AQS API
The following script authenticates to the AQS API, downloads all daily PM2.5 readings for California in 2023, computes the state-wide annual average, identifies days on which any monitor exceeded the 24-hour NAAQS of 35 μg/m³, and plots a monthly time series of the daily maximum. The script uses the 88101 parameter code for FRM/FEM PM2.5 and filters out readings excluded as exceptional events to work with the regulatory baseline dataset. A free AQS API key is available at aqs.epa.gov/data/api/signup.
import requests
import pandas as pd
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
# EPA AQS Daily Summary API
# Docs: https://aqs.epa.gov/aqsweb/documents/data_api.html
# Register for a free API key at: https://aqs.epa.gov/data/api/signup
AQS_BASE = "https://aqs.epa.gov/data/api"
EMAIL = "your_email@example.com" # replace with your registered email
API_KEY = "your_api_key" # replace with your AQS API key
STATE = "06" # California FIPS code (2-digit, zero-padded)
PARAM = "88101" # PM2.5 FRM/FEM (parameter code for PM2.5 Local Conditions)
YEAR = 2023
# Endpoint: dailyData/byState
# Returns one row per monitor-day with arithmetic mean concentration
url = AQS_BASE + "/dailyData/byState"
params = {
"email": EMAIL,
"key": API_KEY,
"param": PARAM,
"bdate": str(YEAR) + "0101",
"edate": str(YEAR) + "1231",
"state": STATE,
}
print("Fetching AQS daily PM2.5 for state", STATE, "year", YEAR, "...")
resp = requests.get(url, params=params, timeout=120)
resp.raise_for_status()
payload = resp.json()
# AQS wraps results in {"Header": [...], "Data": [...]}
rows = payload.get("Data", [])
if not rows:
    print("No data returned. Check email/key registration and parameter code.")
    raise SystemExit(1)
df = pd.DataFrame(rows)
# Key columns: date_local, arithmetic_mean, aqi, county, site_number, sample_duration
df["date_local"] = pd.to_datetime(df["date_local"])
df["arithmetic_mean"] = pd.to_numeric(df["arithmetic_mean"], errors="coerce")
df["aqi"] = pd.to_numeric(df["aqi"], errors="coerce")
# Drop rows flagged as exceptional events excluded from regulatory calculations
# event_type values: "No Events", "Events Included", "Events Excluded",
# "Concurred Events Excluded"
# Keep "No Events" and "Events Included" for the regulatory baseline
df = df[df["event_type"].isin(["No Events", "Events Included", None])].copy()
df = df.dropna(subset=["arithmetic_mean"])
# ---- Annual average across all monitors in the state ----
annual_avg = round(df["arithmetic_mean"].mean(), 2)
print("State-wide daily mean PM2.5 arithmetic average:", annual_avg, "ug/m3")
print("(2024 NAAQS annual standard: 9.0 ug/m3)")
# ---- Daily state maximum (worst monitor each day) ----
daily_max = (
    df.groupby("date_local")["arithmetic_mean"]
    .max()
    .reset_index(name="daily_max_pm25")
)
# ---- Identify exceedance days (daily 24-hr NAAQS = 35 ug/m3) ----
DAILY_NAAQS = 35.0
exc = daily_max[daily_max["daily_max_pm25"] > DAILY_NAAQS].copy()
print("Days with at least one monitor exceeding 35 ug/m3:", len(exc))
if len(exc) > 0:
    worst = exc.nlargest(5, "daily_max_pm25")[["date_local", "daily_max_pm25"]]
    print("Five worst days:")
    for _, row in worst.iterrows():
        print(" ", str(row["date_local"])[:10],
              " ", round(row["daily_max_pm25"], 1), "ug/m3")
# ---- Monthly average for time-series plot ----
daily_max["month"] = daily_max["date_local"].dt.to_period("M")
monthly = daily_max.groupby("month")["daily_max_pm25"].mean().reset_index()
monthly["month_dt"] = monthly["month"].dt.to_timestamp()
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(monthly["month_dt"], monthly["daily_max_pm25"],
        marker="o", linewidth=1.5, color="#0b4a8f", label="Monthly avg daily max PM2.5")
ax.axhline(DAILY_NAAQS, color="#cc3300", linestyle="--", linewidth=1,
           label="24-hr NAAQS (35 ug/m3)")
ax.axhline(9.0, color="#cc8800", linestyle=":", linewidth=1,
           label="Annual NAAQS 2024 (9 ug/m3)")
ax.set_title("California PM2.5 daily max by month (" + str(YEAR) + ")")
ax.set_xlabel("Month")
ax.set_ylabel("PM2.5 (ug/m3)")
ax.legend(fontsize=8)
fig.tight_layout()
fig.savefig("ca_pm25_" + str(YEAR) + ".png", dpi=150)
print("Plot saved to ca_pm25_" + str(YEAR) + ".png")
Several implementation notes are worth calling out. The AQS API returns an event_type field that distinguishes three cases: no exceptional events were flagged, events were flagged but included in regulatory calculations, and events were flagged and excluded. Filtering to “No Events” and “Events Included” produces the regulatory baseline comparable to EPA's attainment determinations; retaining “Events Excluded” rows allows analysis of wildfire and dust event impacts on observed concentrations.
The API returns data at the individual monitor level, so state-wide analysis requires deciding whether to average across monitors (giving equal weight to each monitoring site, regardless of the population it represents) or to weight by the represented population (which requires joining AQS monitor coordinates to Census population data). For regulatory purposes, attainment is determined at the site level: a single monitor with a design value above the NAAQS can place an entire county in nonattainment.
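The aggregation choice can be made concrete with a small sketch. It contrasts a pooled mean over all monitor-day rows (which is what a single .mean() on the full table computes) with an equal-weight mean of per-site means, and then picks the highest site in each county. The field names county_code, site_number, and arithmetic_mean are assumed to match the API response, and the readings are invented. This is a simplified single-period illustration, not EPA's design value calculation.

```python
import pandas as pd

# Invented monitor-day rows; field names are assumed to mirror the API response.
rows = [
    {"county_code": "037", "site_number": "1103", "arithmetic_mean": 10.0},
    {"county_code": "037", "site_number": "1103", "arithmetic_mean": 14.0},
    {"county_code": "037", "site_number": "1103", "arithmetic_mean": 12.0},
    {"county_code": "019", "site_number": "0011", "arithmetic_mean": 20.0},
]
df = pd.DataFrame(rows)

# Pooled mean: every monitor-day counts once, so sites that report
# more often pull the result toward their own level.
pooled_mean = df["arithmetic_mean"].mean()

# Site-level means: one value per monitoring site.
site_mean = df.groupby(["county_code", "site_number"])["arithmetic_mean"].mean()

# Equal-weight mean: each site counts once, however many days it reported.
equal_weight_mean = site_mean.mean()

# Highest site mean per county, the quantity closest in spirit to
# site-level attainment logic.
county_max = site_mean.groupby(level="county_code").max()

print("Pooled mean:", pooled_mean)
print("Equal-weight mean of site means:", equal_weight_mean)
print(county_max)
```

With these sample rows the two averages differ (14.0 pooled versus 16.0 equal-weight) because one site reported three days and the other only one, which is the kind of gap that sampling-frequency differences between monitors can introduce.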
The time series plot will typically reveal seasonal PM2.5 peaks in California: summer and fall spikes driven by wildfire smoke, a winter peak from wood burning and atmospheric inversions in the Central Valley, and a spring minimum as precipitation scours the atmosphere. In recent years, days exceeding 35 μg/m³ have often been wildfire-driven, and the number of such days in affected counties bears directly on whether an exceptional events demonstration is warranted to preserve attainment status.
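To see where and when exceedance days cluster, the counts can be broken out by county and month. The sketch below assumes a table with county, date_local, and arithmetic_mean fields like those used in the script; the county labels and readings are invented for illustration.

```python
import pandas as pd

DAILY_NAAQS = 35.0  # 24-hour PM2.5 standard in ug/m3

# Invented monitor-day rows; "County A" and "County B" are placeholders.
rows = [
    {"county": "County A", "date_local": "2023-01-05", "arithmetic_mean": 41.0},
    {"county": "County A", "date_local": "2023-01-05", "arithmetic_mean": 30.0},
    {"county": "County A", "date_local": "2023-08-20", "arithmetic_mean": 36.5},
    {"county": "County A", "date_local": "2023-04-02", "arithmetic_mean": 6.0},
    {"county": "County B", "date_local": "2023-08-20", "arithmetic_mean": 80.0},
]
df = pd.DataFrame(rows)
df["date_local"] = pd.to_datetime(df["date_local"])

# Worst monitor in each county on each day, so a day is counted once
# per county even when several monitors exceed the threshold.
county_day_max = (
    df.groupby(["county", "date_local"])["arithmetic_mean"]
    .max()
    .reset_index()
)

# Keep only county-days above the 24-hour threshold and tag the month.
exceed = county_day_max[county_day_max["arithmetic_mean"] > DAILY_NAAQS].copy()
exceed["month"] = exceed["date_local"].dt.month

# Number of exceedance days per county and calendar month.
counts = exceed.groupby(["county", "month"]).size()
print(counts)
```

A table like this separates winter exceedances from late-summer ones at a glance, which helps when deciding which days merit a closer look for smoke influence.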
Related writing
CDC WONDER: The Federal Mortality Database Behind Every Cause-of-Death Analysis — PM2.5 attributable deaths visible in AQS data connect to the ICD-10 cardiovascular and respiratory mortality records in CDC WONDER; this piece explains the death certificate pipeline, ICD-10 coding, and age-adjusted rate methodology behind the health burden estimates.
NHTSA FARS: The Federal Census of Every US Traffic Fatality Since 1975 — Motor vehicles are the dominant source of NOx and CO in urban airsheds and a major PM2.5 precursor emitter; FARS tracks the human cost of the traffic system that generates a large share of urban criteria pollutant burden.
HHS Medicaid Enrollment: State-Level Coverage Data and the ACA Expansion Gap — Air pollution's health burden falls disproportionately on low-income communities; Medicaid enrollment data provides the health coverage lens on the same environmental justice populations that EPA's AQS monitoring gaps leave most poorly served.