DEA ARCOS: The Federal Opioid Distribution Database Behind 380 Million Pill Shipment Transactions
For years, a federal database called ARCOS quietly recorded every opioid pill shipped in America: every transaction from manufacturer to distributor to pharmacy, 380 million records in total. The Drug Enforcement Administration kept the data secret. Then the opioid litigation forced it open, and the numbers revealed the architecture of a public health catastrophe: 76 billion oxycodone and hydrocodone pills distributed between 2006 and 2014, routed through three distributors that controlled nearly half the market, flowing in concentrations that amounted to hundreds of pills per resident per year in the hardest-hit counties of West Virginia, Ohio, and Kentucky.
This article covers the statutory and regulatory basis for ARCOS reporting, what the database tracks and how the transaction records are structured, the 2019 federal court order in MDL 2804 that forced the first public release of transaction-level opioid data, the Washington Post publication that made the data searchable by any American, the key findings on pill volumes and geographic concentration, the suspicious order monitoring failures that allowed the distribution network to operate unchecked, the policy response and $55 billion in aggregate opioid litigation settlements, and a Python script that downloads the Washington Post ARCOS bulk data and computes county-level pills-per-capita for oxycodone and hydrocodone.
Statutory and regulatory basis
ARCOS — Automation of Reports and Consolidated Orders System — is a DEA mandatory reporting program established under the Controlled Substances Act. The statutory basis is 21 U.S.C. §827, which requires every person registered to manufacture, distribute, or dispense controlled substances to maintain records and make reports as the Attorney General prescribes. The implementing regulations appear at 21 CFR Part 1304 (records and reports of registrants) and 21 CFR §1304.33 specifically, which mandates ARCOS reporting for Schedule I and II controlled substances.
Under 21 CFR §1304.33, every manufacturer, distributor, and importer of Schedule I and II controlled substances must report to ARCOS each transaction involving those substances (every sale, purchase, return, loss, theft, and surrender) on a quarterly basis. The requirement covers the complete commercial chain above the retail pharmacy level: a manufacturer selling to a distributor must report; the distributor selling to a pharmacy must report; if the pharmacy returns unsold product to the distributor, that transaction is also reported. Retail pharmacies sit at the end of the chain and do not report to ARCOS themselves; they report dispensing to state prescription drug monitoring programs (PDMPs), which DEA can access through cooperative agreements.
The controlled substances subject to ARCOS reporting include the full Schedule I and II list, and the opioid crisis litigation focused on prescription opioids: oxycodone (OxyContin, Percocet), hydrocodone (Vicodin, Norco), fentanyl, morphine, codeine combination products, hydromorphone (Dilaudid), methadone, oxymorphone (Opana), and buprenorphine. Not all of these are Schedule II: buprenorphine, for example, is a Schedule III narcotic, a class that ARCOS also covers. Schedule II stimulants such as amphetamine (Adderall) and methamphetamine are also tracked. For the opioid MDL litigation, the relevant period was 2006 through 2014, the years for which DEA was ordered to produce transaction-level data.
What ARCOS tracks: the transaction record structure
Each ARCOS transaction record captures a single shipment or movement event. The core fields are: the reporter DEA registration number and business name and address; the buyer DEA registration number and business name and address; the drug code (DEA's internal numeric code for the controlled substance) and drug name; the National Drug Code (NDC) identifying the specific product formulation; the quantity in base-weight grams; the dosage unit count (individual pills, patches, or vials); the transaction date; and the transaction code.
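The fields listed above map naturally onto a small record type. The following is a minimal sketch, not DEA's schema: the class name ArcosTransaction, its attribute names, the helper method, and every sample value are illustrative assumptions chosen to mirror the prose.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArcosTransaction:
    """One shipment or movement event. Field names are illustrative."""
    reporter_dea_no: str
    reporter_name: str
    buyer_dea_no: str
    buyer_name: str
    drug_code: str
    drug_name: str
    ndc_no: str
    base_weight_grams: float
    dosage_units: float
    transaction_date: date
    transaction_code: str

    def grams_per_dosage_unit(self) -> float:
        """Active-ingredient weight per pill, patch, or vial."""
        return self.base_weight_grams / self.dosage_units


# Invented sample record; none of these identifiers are real.
example = ArcosTransaction(
    reporter_dea_no="XX0000000",
    reporter_name="EXAMPLE DISTRIBUTOR",
    buyer_dea_no="YY0000000",
    buyer_name="EXAMPLE PHARMACY",
    drug_code="0000",
    drug_name="OXYCODONE",
    ndc_no="00000000000",
    base_weight_grams=1.0,
    dosage_units=100,
    transaction_date=date(2010, 1, 15),
    transaction_code="S",
)
print(example.grams_per_dosage_unit())
```

Keeping both the base weight and the dosage unit count on the record matters because the two answer different questions: dosage units drive pill counts, while base weight drives dose-equivalency comparisons across drugs.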
Transaction codes distinguish the type of movement. The codes used in the litigation-released data include: S (sale or distribution), P (purchase or receipt), T (theft or significant loss), R (return of product), and X (surrender to DEA). The S transactions — sales from distributor to pharmacy — are the primary analytical focus, as they represent the quantity of opioids flowing to end dispensing points. The reporter-buyer structure means each S transaction at the distributor-to-pharmacy level has a corresponding P transaction on the pharmacy side, creating a double-entry record system that DEA can use to reconcile volumes and detect discrepancies.
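The double-entry idea can be sketched as a reconciliation over any set of records in which both the seller and the buyer report. This is an assumption-laden illustration, not DEA's procedure: the column names seller, buyer, code, and dosage_units, the function name reconcile, and the toy rows are all hypothetical, and the public bulk files may not contain both sides of each shipment.

```python
import pandas as pd


def reconcile(records: pd.DataFrame) -> pd.DataFrame:
    """Compare units reported as sold (S) with units reported as purchased (P)
    for each seller-buyer pair and return only the pairs that disagree."""
    sold = (
        records[records["code"] == "S"]
        .groupby(["seller", "buyer"])["dosage_units"]
        .sum()
        .rename("sold")
    )
    purchased = (
        records[records["code"] == "P"]
        .groupby(["seller", "buyer"])["dosage_units"]
        .sum()
        .rename("purchased")
    )
    # Outer-align the two sides; a pair missing on one side counts as zero.
    both = pd.concat([sold, purchased], axis=1).fillna(0)
    both["gap"] = both["sold"] - both["purchased"]
    return both[both["gap"] != 0].reset_index()


# Toy data: the second pair has 200 units reported sold but not received.
records = pd.DataFrame(
    [
        {"seller": "DIST_A", "buyer": "PHARM_1", "code": "S", "dosage_units": 1000},
        {"seller": "DIST_A", "buyer": "PHARM_1", "code": "P", "dosage_units": 1000},
        {"seller": "DIST_A", "buyer": "PHARM_2", "code": "S", "dosage_units": 500},
        {"seller": "DIST_A", "buyer": "PHARM_2", "code": "P", "dosage_units": 300},
    ]
)
mismatches = reconcile(records)
print(mismatches)
```

A nonzero gap is only a prompt for review; timing differences between reporting periods could produce one without any diversion.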
The DEA registration number (DEA number) is the key identifier linking reporters and buyers across transactions. Every DEA registrant — manufacturer, distributor, pharmacy, hospital, practitioner — is assigned a unique DEA number that encodes the registrant type, last name initial, and a checksum. The ARCOS transaction records allow reconstruction of the complete chain of custody for any pill: which manufacturer produced the active pharmaceutical ingredient, which labeler packaged it, which wholesale distributor shipped it to which pharmacy on which date in what quantity. This linkage is what made the litigation data so powerful — it connected Purdue Pharma's oxycodone production through McKesson's distribution network to the specific pharmacies in Mingo County, West Virginia, that dispensed the pills.
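The checksum mentioned above can be illustrated with a short validator. This sketch assumes the commonly documented layout (two leading characters followed by seven digits) and the commonly described check-digit rule: add the first, third, and fifth digits to twice the sum of the second, fourth, and sixth, and compare the last digit of the total with the seventh digit. The function name is_valid_dea_number is hypothetical, and the sample numbers are the textbook-style example, not real registrations.

```python
import re


def is_valid_dea_number(dea_no: str) -> bool:
    """Check the assumed format and check digit of a DEA registration number."""
    dea_no = dea_no.strip().upper()
    # Assumed layout: registrant-type letter, a letter (or 9), then 7 digits.
    if not re.fullmatch(r"[A-Z][A-Z9]\d{7}", dea_no):
        return False
    digits = [int(ch) for ch in dea_no[2:]]
    odd_sum = digits[0] + digits[2] + digits[4]
    even_sum = digits[1] + digits[3] + digits[5]
    # The last digit of the weighted total must equal the seventh digit.
    return (odd_sum + 2 * even_sum) % 10 == digits[6]


print(is_valid_dea_number("AB1234563"))
print(is_valid_dea_number("AB1234560"))
```

For AB1234563 the weighted total is (1 + 3 + 5) + 2 × (2 + 4 + 6) = 33, whose last digit matches the final 3. A passing checksum shows only that the number is well formed, not that the registration exists or is active.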
The litigation release: MDL 2804 and the 2019 court order
ARCOS transaction-level data was classified as law enforcement sensitive and had never been publicly released. Researchers, journalists, and state attorneys general had known of its existence but had been unable to obtain it through Freedom of Information Act requests, which DEA denied on law enforcement exemption grounds. The opioid litigation changed the calculus.
In re: National Prescription Opiate Litigation, MDL 2804, is the consolidated federal multidistrict litigation handling opioid claims from thousands of municipalities, counties, Native American tribes, hospitals, and third-party payors against manufacturers and distributors. The MDL was assigned to the Honorable Dan Aaron Polster in the United States District Court for the Northern District of Ohio. In July 2019, Judge Polster ordered the DEA to produce ARCOS transaction-level data for 2006 through 2014. The Washington Post and HD Media (publisher of newspapers in West Virginia including the Charleston Gazette-Mail, which had won a Pulitzer Prize for its opioid reporting) intervened and obtained an order permitting public release of the data.
The Washington Post published the ARCOS data in July 2019 alongside a suite of investigative articles. The Post built a searchable database allowing any American to look up the volume of opioid pills shipped to any pharmacy or county in the United States between 2006 and 2014. The database was accessible at washingtonpost.com/graphics/2019/investigations/dea-pain-pill-database/ and included county-level and pharmacy-level drill-down. HD Media simultaneously published a West Virginia–focused interface. The publication set off the next wave of litigation settlements, as the transaction-level data provided plaintiffs with granular evidence of specific shipments to specific pharmacies that distributors had certified were legitimate.
The Post also made the raw bulk data files available for download, organized by state, as gzip-compressed tab-separated files. This was the first time any member of the public could analyze the complete ARCOS transaction record for a state or county. Researchers at universities and public health agencies immediately began publishing analyses; congressional staffers used the data in briefings; state attorneys general used it in ongoing litigation; and journalists at local news organizations produced county-by-county accountability reporting that had been impossible before.
Scale of the opioid distribution network: 76 billion pills
The headline figure from the ARCOS data is stark: 76 billion oxycodone and hydrocodone pills were shipped to pharmacies and practitioners in the United States between 2006 and 2014. That is about 246 pills for every person counted in the 2010 Census (76 billion divided by a population of roughly 309 million). The 380 million individual transaction records document how those pills moved through the supply chain.
The geographic concentration of opioid distribution was extreme. West Virginia, the state with the highest opioid mortality rate in the country, received shipments averaging roughly 780 pills per resident per year over the 2006–2014 period. Individual counties were far worse. Mingo County, West Virginia — population approximately 25,000 — received 3.3 million hydrocodone pills over two years, equivalent to more than 66 pills per resident per year from hydrocodone alone, on top of comparable oxycodone volumes. McDowell County, Wyoming County, and Logan County in West Virginia showed similar patterns. Norton, Virginia — population 3,900 — had a single pharmacy, Hurley Drug Company, that received 10 million opioid pills between 2006 and 2016. Some small pharmacies in West Virginia received shipments of controlled substances that would have required every resident of their county to be taking opioids continuously.
The data revealed which manufacturers produced the pills flowing into the hardest-hit communities. Purdue Pharma, maker of OxyContin (extended-release oxycodone), was the most prominent but not the largest by volume. Mallinckrodt Pharmaceuticals was the largest generic oxycodone and hydrocodone manufacturer by dosage unit volume, producing under the SpecGx brand. Actavis (now part of Teva Pharmaceutical) and Amneal Pharmaceuticals were also significant generic producers. Purdue's OxyContin had the highest average price per pill but Mallinckrodt's generic oxycodone, at a fraction of the cost, achieved far higher volume.
The Big Three distributors and market concentration
Three wholesale pharmaceutical distributors — McKesson Corporation, Cardinal Health, and AmerisourceBergen (now Cencora) — collectively handled approximately 44% of all opioid distribution in the United States over the ARCOS period. These companies are among the largest corporations in the United States by revenue; all three rank in the top 15 of the Fortune 500. Their core business is pharmaceutical distribution: buying drugs from manufacturers at wholesale prices and delivering them to pharmacies, hospitals, and other dispensing points on just-in-time schedules. The opioid business was a significant revenue stream for all three throughout the 2006–2014 period.
The ARCOS data allowed plaintiffs to reconstruct each distributor's shipment volumes to specific pharmacies. In the hardest-hit communities, the transaction records showed McKesson, Cardinal Health, and AmerisourceBergen repeatedly shipping quantities to small-town pharmacies that would have required extraordinary per-capita consumption to explain as legitimate medical use. A pharmacy serving 1,500 customers receiving 500,000 hydrocodone pills per year — a documented pattern in multiple West Virginia pharmacies — implies either that every resident was a chronic pain patient receiving near-maximum dosing, or that a substantial fraction of the product was being diverted to illegal use. The distributors had the ARCOS data themselves; they were required to monitor it under their DEA registrations.
Suspicious order monitoring: the regulatory failure
The regulatory requirement that distributors should have caught these patterns is codified at 21 CFR §1301.74(b). That regulation requires every DEA registrant who distributes controlled substances to design and operate a system to disclose to the registrant suspicious orders of controlled substances, and to inform the field division office of the DEA of suspicious orders when discovered. The regulation defines suspicious orders as those of unusual size, those deviating substantially from a normal pattern, and those of unusual frequency.
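The three criteria (unusual size, deviation from a normal pattern, unusual frequency) can be sketched as simple checks against a customer's own order history. The regulation does not prescribe a formula, so everything here is an assumption made for illustration: the function name flag_suspicious_order, its parameters, and the default thresholds are arbitrary placeholders, not values any distributor or DEA is known to use.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def flag_suspicious_order(
    past_sizes: Sequence[float],
    past_monthly_counts: Sequence[int],
    new_size: float,
    orders_this_month: int,
    size_multiple: float = 3.0,
    z_cutoff: float = 3.0,
    frequency_multiple: float = 2.0,
) -> List[str]:
    """Return the reasons an order looks suspicious; empty list if none."""
    if not past_sizes or not past_monthly_counts:
        raise ValueError("need order history to compare against")
    reasons: List[str] = []
    typical_size = mean(past_sizes)
    spread = pstdev(past_sizes)
    # Unusual size: far larger than this customer's typical order.
    if new_size > size_multiple * typical_size:
        reasons.append("unusual size")
    # Deviation from pattern: many standard deviations away from the mean.
    if spread > 0 and abs(new_size - typical_size) / spread > z_cutoff:
        reasons.append("deviates from normal pattern")
    # Unusual frequency: far more orders this month than in a typical month.
    if orders_this_month > frequency_multiple * mean(past_monthly_counts):
        reasons.append("unusual frequency")
    return reasons


history_sizes = [1000, 1100, 900, 1000]
history_counts = [4, 4, 5, 4]
print(flag_suspicious_order(history_sizes, history_counts, 5000, 4))
print(flag_suspicious_order(history_sizes, history_counts, 1050, 4))
```

The choice of thresholds is the whole game: set the multiples high enough and no order is ever flagged, which is the failure mode the litigation record describes.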
The ARCOS data and subsequent litigation revealed that the Big Three had failed to implement adequate suspicious order monitoring systems despite documented knowledge of the diversion problem. Internal McKesson documents introduced as evidence in multiple litigations showed that compliance personnel had flagged numerous pharmacies as suspicious but that business-unit managers had overridden or delayed reporting. Cardinal Health's suspicious order monitoring system, investigators found, used algorithms with thresholds so high that virtually no order ever triggered a report to DEA. AmerisourceBergen similarly failed to halt shipments to pharmacies that its own systems identified as anomalous.
DEA enforcement actions established the regulatory violations. In 2017, McKesson paid $150 million in civil penalties and agreed to suspend controlled substance distribution operations at multiple distribution centers — the largest settlement with a drug distributor at the time. AmerisourceBergen paid $150 million to settle related DEA and DOJ allegations in 2017. Cardinal Health paid $44 million in 2016. These settlements did not resolve the broader opioid litigation; they addressed only the specific DEA regulatory violations and were later credited against the larger MDL settlements.
DEA itself has faced sustained criticism for its role in enabling the distribution of unprecedented opioid volumes. A 2017 joint investigation by the Washington Post and 60 Minutes revealed that Congress, lobbied by the drug industry, had enacted the Ensuring Patient Access and Effective Drug Enforcement Act of 2016, which made it significantly harder for DEA to immediately suspend a drug company's DEA registration even when it found an imminent public health danger. The Act passed both chambers with minimal opposition and was signed by President Obama. A subsequent Senate investigation found that DEA leadership had approved the bill despite opposition from career enforcement agents. The bill effectively hamstrung DEA's most powerful enforcement tool during the peak years of the opioid crisis.
Purdue Pharma, the Sackler family, and Mallinckrodt
Purdue Pharma LP launched OxyContin in 1996 with a marketing campaign that proved as consequential as any pharmaceutical launch in American history. The drug was extended-release oxycodone, and Purdue marketed it aggressively to primary care physicians on the claim — later found by multiple courts to be misleading — that the extended-release formulation reduced addiction risk relative to immediate-release opioids. Purdue paid bonuses to sales representatives based on OxyContin prescription volume and funded continuing medical education programs promoting pain management with opioids. By the early 2000s, OxyContin was generating more than $1 billion annually.
In 2007, the Purdue Frederick Company, a Purdue affiliate, pleaded guilty to a federal felony charge of misbranding OxyContin, and three executives pleaded guilty to misdemeanor misbranding charges; together they paid $634 million, at the time the largest such settlement in a case involving a pharmaceutical company. The plea did not stop the sales. OxyContin revenues continued to grow through 2010, when Purdue reformulated the tablet to resist crushing and dissolving for injection or inhalation. That reformulation contributed to users shifting to heroin and illicitly manufactured fentanyl, accelerating the second and third waves of the opioid crisis. Purdue filed for Chapter 11 bankruptcy in September 2019. A 2020 settlement valued at $8.34 billion resolved federal criminal and civil liability, with Purdue pleading guilty to three federal felonies. The Sackler family, which owned Purdue privately, agreed to pay $6 billion to states and other claimants, initially in exchange for broad legal immunity. The Supreme Court struck down that provision in Harrington v. Purdue Pharma (2024), requiring renegotiation of the settlement.
Mallinckrodt Pharmaceuticals, headquartered in Ireland with U.S. operations in St. Louis, was the largest generic opioid manufacturer by volume during the ARCOS period. Its SpecGx subsidiary produced the generic oxycodone and hydrocodone that flooded rural pharmacies in Appalachia and the Midwest. Mallinckrodt filed for Chapter 11 bankruptcy in October 2020 and reached a settlement of $1.6 billion, paid over eight years, to resolve opioid liability claims from states and municipalities. Mallinckrodt filed for bankruptcy a second time in 2023.
Geographic patterns and the pills-per-capita metric
The pills-per-capita metric — total dosage units shipped divided by county or state population — became the standard summary statistic for measuring opioid distribution intensity after the ARCOS release. It is a blunt instrument: it does not account for the age structure of the population, the prevalence of chronic pain conditions, or legitimate variation in prescribing patterns. But at the extremes documented in the ARCOS data, the metric is analytically unambiguous. A county receiving 1,000 opioid pills per resident per year cannot plausibly be serving legitimate medical need; the population would need to be universally prescribed chronic opioid therapy at maximum dosing to consume that volume. The pills-per-capita figures in Mingo, McDowell, and Wyoming counties in West Virginia were in that range.
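The metric itself is a one-line calculation. The sketch below annualizes it so that figures covering different numbers of years can be compared; the function name pills_per_capita_per_year is hypothetical, and the sample inputs are the Mingo County hydrocodone figures quoted earlier in this article (3.3 million pills, about 25,000 residents, two years).

```python
def pills_per_capita_per_year(
    total_dosage_units: float, population: int, years: int
) -> float:
    """Dosage units shipped per resident per year over the given period."""
    if population <= 0 or years <= 0:
        raise ValueError("population and years must be positive")
    return total_dosage_units / population / years


# Figures as quoted in the article for Mingo County hydrocodone shipments.
mingo_hydrocodone = pills_per_capita_per_year(3_300_000, 25_000, 2)
print(mingo_hydrocodone)
```

Dividing by the number of years is easy to forget: a cumulative nine-year total divided only by population produces a number nine times larger than the annual rate, and the two should never be compared directly.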
The geographic concentration of distribution followed two overlapping patterns. The first was rural Appalachia: counties in West Virginia, eastern Kentucky, southwest Virginia, and eastern Tennessee had high per-capita distribution driven by a combination of high rates of workers with chronic pain injuries from coal mining and physical labor, underinsured populations with limited access to non-opioid pain management, and the specific geography of pill mills — pain clinics that dispensed opioids with minimal clinical gatekeeping. The second pattern was suburban and exurban Ohio, Pennsylvania, and the industrial Midwest: higher absolute volumes reflecting population density, with pills-per-capita ratios lower than Appalachia but still vastly exceeding any plausible therapeutic need. Cuyahoga County (Cleveland) and Summit County (Akron) in Ohio filed two of the bellwether cases in MDL 2804.
The Washington Post database and public access
The Washington Post's searchable interface, launched in July 2019, allowed anyone to enter a county name or a pharmacy's name and address and retrieve the total opioid pills shipped there between 2006 and 2014. The interface was built on top of a database derived from the litigation-released ARCOS files and supplemented with the Post's reporting. Users could see totals by year, by drug, by distributor, and by manufacturer. County-level data was precomputed and displayed with population context. The database received enormous traffic upon launch and has remained one of the most-cited sources in opioid policy reporting.
The Post made the underlying bulk data files publicly downloadable, organized by state. Each file is a tab-separated, gzip-compressed file containing the complete transaction records for that state. File sizes vary by state population and opioid distribution intensity; the West Virginia and Ohio files are among the larger ones. The files include all fields from the ARCOS transaction records plus several Post-computed fields: the morphine milligram equivalent (MME) conversion factor for each drug, the computed MME for each transaction, the product name and ingredient name from the NDC database, and the combined labeler name. These computed fields significantly reduce the preprocessing burden for analysis.
DEA does not independently publish transaction-level ARCOS data. Outside the litigation release and the Washington Post publication, the primary routes to ARCOS data are: FOIA requests to DEA (which have typically been denied on law enforcement exemption grounds), state attorney general offices that obtained data as part of their opioid litigation and in some cases have published it, DEA's published annual drug diversion statistics (which contain aggregate totals, not transaction-level detail), and secondary analyses published by NIDA, NIH, and academic researchers using the litigation data. The Washington Post bulk download remains the most accessible source of transaction-level data for the 2006–2014 period.
Policy response and opioid settlements
The federal legislative response to the opioid crisis included the Comprehensive Addiction and Recovery Act of 2016 (CARA) and, more significantly, the SUPPORT for Patients and Communities Act of 2018, which represented the largest federal legislative package on addiction and opioids since the 1970s. SUPPORT Act provisions included: expanding Medicare and Medicaid coverage for substance use disorder treatment; authorizing DEA to allow pharmacies to dispense buprenorphine to opioid-dependent individuals without a separate DEA registration; directing SAMHSA to expand the opioid treatment program framework; requiring prescribers to complete training on opioid prescribing; and adding requirements for opioid packaging limits on initial prescriptions for acute pain. The Act was enacted with broad bipartisan support.
DEA published enhanced suspicious order monitoring regulations effective 2021, requiring distributors to use more sophisticated algorithms and to maintain detailed documentation of their monitoring decisions. The new rules, codified at 21 CFR Part 1301, require distributors to conduct due diligence before completing shipments to customers that trigger suspicious order indicators, and to document the basis for proceeding or halting. The 2021 rules were more explicit than the original 1971 regulatory language and were intended to prevent recurrence of the monitoring failures that the ARCOS data had exposed.
The aggregate financial resolution of the opioid litigation totaled more than $55 billion. The Big Three distributor settlement — McKesson, Cardinal Health, and AmerisourceBergen — reached $21 billion to be paid over 18 years, announced in 2021 and finalized in 2022. Johnson & Johnson, which manufactured the active pharmaceutical ingredients for many opioids through its Janssen subsidiary and which had specifically marketed extended-release opioids to physicians, settled for $5 billion over nine years. Purdue Pharma's $8.34 billion federal settlement and the Sackler family's $6 billion individual contributions are the largest component attributable to a single manufacturer. Endo International (Percocet manufacturer) settled for $603 million through its bankruptcy. Walgreens settled for $5.7 billion in 2022; CVS settled for $5 billion; Walmart settled for $3.1 billion. Virtually all of the settlement funds are directed to states and localities for opioid treatment, recovery programs, harm reduction, and prevention.
Python: downloading and analyzing the Washington Post ARCOS data
The following script downloads the Washington Post ARCOS bulk data file for West Virginia (or any state), filters to opioid sale transactions, aggregates dosage units by buyer county and drug name, computes pills-per-capita using 2010 Census population, and prints the top 10 counties by pills-per-capita for oxycodone and hydrocodone. The script also identifies the largest distributors by total pills shipped and produces an annual trend table. Requirements: requests and pandas.
import requests
import pandas as pd
import io
# ---------------------------------------------------------------------------
# DEA ARCOS Opioid Distribution Analysis
# Washington Post bulk download endpoints (published 2019 via MDL 2804 release)
# ---------------------------------------------------------------------------
# The Washington Post published state-level ARCOS bulk files (gzip-compressed
# TSV) at the URL pattern below. Each file contains transaction-level pill
# shipment records for one state, 2006-2014.
#
# Fields: REPORTER_DEA_NO, REPORTER_NAME, REPORTER_ADDL_CO_INFO,
# REPORTER_ADDRESS1, REPORTER_CITY, REPORTER_STATE, REPORTER_ZIP,
# REPORTER_COUNTY, BUYER_DEA_NO, BUYER_NAME, BUYER_ADDL_CO_INFO,
# BUYER_ADDRESS1, BUYER_CITY, BUYER_STATE, BUYER_ZIP,
# BUYER_COUNTY, TRANSACTION_CODE, DRUG_CODE, NDC_NO, DRUG_NAME,
# QUANTITY, UNIT, ACTION_INDICATOR, ORDER_FORM_NO, CORRECTION_NO,
# STRENGTH, TRANSACTION_DATE, CALC_BASE_WT_IN_GM, DOSAGE_UNIT,
# TRANSACTION_ID, Product_Name, Ingredient_Name, Measure, MME_Conversion_Factor,
# Combined_Labeler_Name, Revised_Company_Name, Reporter_family,
# dos_str, MME
#
# TRANSACTION_CODE: S = Sale/distribution, P = Purchase/receipt,
# T = Theft/loss, R = Return, X = Surrender to DEA
BASE_URL = "https://www.washingtonpost.com/wp-stat/data/dea-pain-pill-database/bulk/"
# Two-letter state postal abbreviation (lowercased to build the WaPo URL)
STATE = "WV" # West Virginia; swap for OH, KY, TN, PA, etc.
url = f"{BASE_URL}arcos-{STATE.lower()}-statewide-itemized.tsv.gz"
print(f"Downloading ARCOS data for {STATE} from Washington Post...")
print(f"URL: {url}")
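# resp.content holds the whole compressed file in memory before parsing
resp = requests.get(url, timeout=300)
resp.raise_for_status()
# Read gzip-compressed TSV directly into pandas
df = pd.read_csv(
    io.BytesIO(resp.content),
    sep="\t",
    compression="gzip",
    low_memory=False,
    dtype={
        "REPORTER_ZIP": str,
        "BUYER_ZIP": str,
        "BUYER_COUNTY": str,
        # Keep dates as text so a leading zero in the month is not dropped
        "TRANSACTION_DATE": str,
    },
)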
print(f"Loaded {len(df):,} transaction records for {STATE}")
print(f"Date range: {df['TRANSACTION_DATE'].min()} to {df['TRANSACTION_DATE'].max()}")
print(f"Unique drug names: {sorted(df['DRUG_NAME'].dropna().unique())}")
# ---------------------------------------------------------------------------
# Filter to opioid pill shipments (sales, S code) and key drugs
# ---------------------------------------------------------------------------
OPIOID_DRUGS = {
    "OXYCODONE",
    "HYDROCODONE",
    "FENTANYL",
    "MORPHINE",
    "HYDROMORPHONE",
    "OXYMORPHONE",
    "METHADONE",
    "CODEINE",
    "BUPRENORPHINE",
}
df_sales = df[df["TRANSACTION_CODE"] == "S"].copy()
df_opioids = df_sales[
    df_sales["DRUG_NAME"].str.upper().isin(OPIOID_DRUGS)
].copy()
print(f"\nSale transactions for tracked opioids: {len(df_opioids):,}")
# Parse transaction year. Dates are MMDDYYYY; if the column was read as an
# integer the leading zero is lost, so pad back to 8 characters first.
df_opioids["YEAR"] = pd.to_datetime(
    df_opioids["TRANSACTION_DATE"].astype(str).str.zfill(8),
    format="%m%d%Y",
    errors="coerce",
).dt.year
# County-level aggregation: sum DOSAGE_UNIT (pill count) by county and drug
county_drug = (
    df_opioids
    .groupby(["BUYER_COUNTY", "DRUG_NAME"])["DOSAGE_UNIT"]
    .sum()
    .reset_index()
    .rename(columns={"DOSAGE_UNIT": "total_pills"})
)
# ---------------------------------------------------------------------------
# 2010 Census population for West Virginia counties.
# Hardcoded for reproducibility; swap with Census API call for other states.
# ---------------------------------------------------------------------------
WV_POP_2010 = {
    "BARBOUR": 16589, "BERKELEY": 104169, "BOONE": 24629, "BRAXTON": 14523,
    "BROOKE": 23875, "CABELL": 96319, "CALHOUN": 7627, "CLAY": 9386,
    "DODDRIDGE": 8202, "FAYETTE": 46039, "GILMER": 8693, "GRANT": 11937,
    "GREENBRIER": 35480, "HAMPSHIRE": 23964, "HANCOCK": 30676,
    "HARDY": 14025, "HARRISON": 68652, "JACKSON": 29211, "JEFFERSON": 53498,
    "KANAWHA": 193063, "LEWIS": 16372, "LINCOLN": 21720, "LOGAN": 36743,
    "MCDOWELL": 22113, "MARION": 56418, "MARSHALL": 33107, "MASON": 27324,
    "MERCER": 62264, "MINERAL": 28212, "MINGO": 26839, "MONONGALIA": 96189,
    "MONROE": 13502, "MORGAN": 17541, "NICHOLAS": 26233, "OHIO": 44443,
    "PENDLETON": 7695, "PLEASANTS": 7605, "POCAHONTAS": 8719, "PRESTON": 33520,
    "PUTNAM": 55486, "RALEIGH": 78859, "RANDOLPH": 29405, "RITCHIE": 10449,
    "ROANE": 14926, "SUMMERS": 13927, "TAYLOR": 16895, "TUCKER": 7141,
    "TYLER": 9208, "UPSHUR": 24254, "WAYNE": 42481, "WEBSTER": 9154,
    "WETZEL": 16583, "WIRT": 5717, "WOOD": 86956, "WYOMING": 23796,
}
pop_df = pd.DataFrame(
    list(WV_POP_2010.items()), columns=["BUYER_COUNTY", "population"]
)
# Normalize county name capitalization
county_drug["BUYER_COUNTY"] = county_drug["BUYER_COUNTY"].str.upper().str.strip()
pop_df["BUYER_COUNTY"] = pop_df["BUYER_COUNTY"].str.upper().str.strip()
# Merge population; counties with no match keep NaN and sort last
county_drug = county_drug.merge(pop_df, on="BUYER_COUNTY", how="left")
# Cumulative pills per resident over the whole period, not an annual rate
county_drug["pills_per_capita"] = (
    county_drug["total_pills"] / county_drug["population"]
)
# ---------------------------------------------------------------------------
# Top 10 counties by pills-per-capita: Oxycodone
# ---------------------------------------------------------------------------
oxy_top10 = (
    county_drug[county_drug["DRUG_NAME"].str.upper() == "OXYCODONE"]
    .sort_values("pills_per_capita", ascending=False)
    .head(10)
)
print("\n=== Top 10 WV Counties: Oxycodone Pills Per Capita (2006-2014) ===")
print(f" {'County':<20} {'Total Pills':>14} {'Population':>12} {'Pills/Capita':>14}")
print(" " + "-" * 64)
for _, row in oxy_top10.iterrows():
    print(
        f" {row['BUYER_COUNTY'].title():<20} "
        f"{row['total_pills']:>14,.0f} "
        f"{row['population']:>12,.0f} "
        f"{row['pills_per_capita']:>14.1f}"
    )
# ---------------------------------------------------------------------------
# Top 10 counties by pills-per-capita: Hydrocodone
# ---------------------------------------------------------------------------
hydro_top10 = (
    county_drug[county_drug["DRUG_NAME"].str.upper() == "HYDROCODONE"]
    .sort_values("pills_per_capita", ascending=False)
    .head(10)
)
print("\n=== Top 10 WV Counties: Hydrocodone Pills Per Capita (2006-2014) ===")
print(f" {'County':<20} {'Total Pills':>14} {'Population':>12} {'Pills/Capita':>14}")
print(" " + "-" * 64)
for _, row in hydro_top10.iterrows():
    print(
        f" {row['BUYER_COUNTY'].title():<20} "
        f"{row['total_pills']:>14,.0f} "
        f"{row['population']:>12,.0f} "
        f"{row['pills_per_capita']:>14.1f}"
    )
# ---------------------------------------------------------------------------
# Top distributors by total opioid dosage units shipped
# ---------------------------------------------------------------------------
top_reporters = (
    df_opioids
    .groupby("REPORTER_NAME")["DOSAGE_UNIT"]
    .sum()
    .sort_values(ascending=False)
    .head(15)
    .reset_index()
    .rename(columns={"DOSAGE_UNIT": "total_pills_shipped"})
)
print(f"\n=== Top 15 Opioid Distributors in {STATE} by Pills Shipped ===")
print(f" {'Distributor':<45} {'Pills Shipped':>16}")
print(" " + "-" * 64)
for _, row in top_reporters.iterrows():
    print(f" {row['REPORTER_NAME']:<45} {row['total_pills_shipped']:>16,.0f}")
# ---------------------------------------------------------------------------
# Annual trend: total opioid pills shipped by year
# ---------------------------------------------------------------------------
annual = (
    df_opioids
    .groupby("YEAR")["DOSAGE_UNIT"]
    .sum()
    .reset_index()
    .rename(columns={"DOSAGE_UNIT": "pills"})
    .sort_values("YEAR")
)
print(f"\n=== {STATE} Total Opioid Pills Shipped by Year ===")
state_pop = sum(WV_POP_2010.values())
print(f" {'Year':<6} {'Total Pills':>16} {'Pills Per Capita':>18}")
print(" " + "-" * 44)
for _, row in annual.iterrows():
    if pd.notna(row["YEAR"]) and row["YEAR"] >= 2006:
        ppc = row["pills"] / state_pop
        bar = "#" * int(ppc / 20)
        print(f" {int(row['YEAR']):<6} {row['pills']:>16,.0f} {ppc:>18.1f} {bar}")
The Washington Post bulk files are large; the West Virginia file is several hundred megabytes compressed. The script holds the downloaded gzip payload in memory and reads it directly into pandas without writing to disk. For multi-state analysis, loop the download over state abbreviations and concatenate the resulting DataFrames. The DOSAGE_UNIT field counts individual pills (tablets, capsules) or patches; it is the appropriate numerator for pills-per-capita calculations. The CALC_BASE_WT_IN_GM field gives the active ingredient weight in grams; convert it to milligrams and multiply by the MME_Conversion_Factor column to compute morphine milligram equivalents for dose-equivalency analysis across different opioids.
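The dose-equivalency calculation can be sketched as follows. It assumes that morphine milligram equivalents are the base weight converted from grams to milligrams and multiplied by the conversion factor; the function name add_mme and the output column mme_mg are hypothetical, and the toy conversion factors (1.5 for oxycodone, 1.0 for hydrocodone) are the commonly cited values, included only to make the arithmetic visible. Comparing the result against the file's own MME column is a sensible check before relying on it.

```python
import pandas as pd


def add_mme(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with an 'mme_mg' column: grams -> milligrams x factor."""
    out = df.copy()
    out["mme_mg"] = (
        out["CALC_BASE_WT_IN_GM"] * 1000.0 * out["MME_Conversion_Factor"]
    )
    return out


# Toy rows using the bulk-file column names; the values are invented.
toy = pd.DataFrame(
    {
        "DRUG_NAME": ["OXYCODONE", "HYDROCODONE"],
        "CALC_BASE_WT_IN_GM": [2.0, 2.0],
        "MME_Conversion_Factor": [1.5, 1.0],
    }
)
with_mme = add_mme(toy)
print(with_mme[["DRUG_NAME", "mme_mg"]])
```

Summing mme_mg rather than DOSAGE_UNIT lets shipments of different drugs and tablet strengths be compared on a common potency scale.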
Data limitations and research notes
The ARCOS data released via MDL 2804 covers 2006 through 2014 only. It does not include data from 2015 onward, when the composition of the opioid crisis shifted substantially: heroin and illicitly manufactured fentanyl became the dominant drivers of overdose mortality, while prescription opioid distribution declined following regulatory and prescribing-practice changes. The 2006–2014 window captures the peak of prescription opioid diversion but does not document the subsequent synthetic opioid wave that drove overdose deaths above 80,000 per year by the early 2020s.
The ARCOS records cover manufacturer-to-distributor and distributor-to-pharmacy transactions; they do not contain patient-level dispensing records. Patient-level data is held in state PDMPs, which vary in coverage, real-time accessibility, and interstate interoperability. The federal PDMP Interstate Data Sharing initiative has improved cross-state visibility since 2016, but PDMP data remains state-held and research access requires individual state data-sharing agreements. DEA has access to PDMP data under the Omnibus Crime Control and Safe Streets Act and cooperative agreements with states, but that data is not part of the litigation release.
The pills-per-capita metric overstates distribution concentration in counties that serve as regional pharmacy hubs — where patients from multiple counties fill prescriptions at a single pharmacy. Large regional pharmacy chains in relatively populous counties may show high total distribution volumes that reflect a catchment area larger than their home county. For analyses requiring catchment-area adjustment, the buyer ZIP code field allows matching to ZIP code tabulation area (ZCTA) population estimates, which provide finer geographic granularity than county-level analysis.
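A ZIP-level version of the calculation might look like the sketch below. It assumes a separate population table keyed by ZCTA with columns named zcta and population (both names are hypothetical), treats a five-digit ZIP code as if it were the matching ZCTA, which is only an approximation, and uses invented ZIP codes and counts.

```python
import pandas as pd


def pills_per_capita_by_zip(
    transactions: pd.DataFrame, zcta_population: pd.DataFrame
) -> pd.DataFrame:
    """Sum DOSAGE_UNIT by 5-digit buyer ZIP and divide by ZCTA population."""
    shipped = (
        transactions
        # Normalize to a 5-character string so "1234" and "01234" agree.
        .assign(BUYER_ZIP=transactions["BUYER_ZIP"].astype(str).str.zfill(5).str[:5])
        .groupby("BUYER_ZIP", as_index=False)["DOSAGE_UNIT"]
        .sum()
        .rename(columns={"DOSAGE_UNIT": "total_pills"})
    )
    merged = shipped.merge(
        zcta_population, left_on="BUYER_ZIP", right_on="zcta", how="left"
    )
    merged["pills_per_capita"] = merged["total_pills"] / merged["population"]
    return merged


# Invented ZIP codes and counts, for illustration only.
tx = pd.DataFrame(
    {"BUYER_ZIP": ["00001", "00001", "00002"], "DOSAGE_UNIT": [100, 200, 50]}
)
pops = pd.DataFrame({"zcta": ["00001", "00002"], "population": [100, 50]})
by_zip = pills_per_capita_by_zip(tx, pops)
print(by_zip[["BUYER_ZIP", "total_pills", "pills_per_capita"]])
```

ZIP-level figures are noisier than county figures and still reflect where pharmacies sit rather than where patients live, so they refine the hub problem without eliminating it.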
The arcos R package, published on CRAN, provides a programmatic interface to the Washington Post ARCOS data for R users, including functions for downloading state-level and pharmacy-level data, computing per-capita metrics, and joining to Census geographic data. The package was developed by journalists and researchers who worked with the Post data after the 2019 release and substantially reduces the data processing burden for R-based analysis.