Every year the Centers for Medicare & Medicaid Services publishes a dataset showing exactly what Medicare paid each hospital for each type of inpatient case—broken down to the Diagnosis Related Group level. Roughly 3,000 hospitals, 760 DRG codes, and the full charge-versus-payment picture for every combination. The data reveals that a hospital in California might charge $80,000 for a knee replacement while Medicare pays $16,000, and that an identical procedure at a hospital two states away carries a Medicare payment of $32,000. Understanding this dataset means understanding how the federal government actually pays for hospital care.
Program overview
The CMS Medicare Inpatient Provider Charge Data—formally the Inpatient Prospective Payment System (IPPS) Provider Summary—is published annually by the Centers for Medicare & Medicaid Services and is available without restriction on data.cms.gov. Each annual file covers discharges from a single federal fiscal year (October 1 through September 30) and is typically released roughly 18 months after the close of that year, once provider cost reports are settled.
The dataset covers all Medicare-certified hospitals that participate in the IPPS and had at least 11 discharges for a given DRG during the reporting year. The 11-discharge threshold is the CMS privacy floor; combinations below it are suppressed. In practice this means the dataset captures the large majority of Medicare inpatient volume but omits very low-volume hospital–DRG pairs. The most recent full releases cover approximately 3,000 hospitals and 760 DRG codes. Not every hospital reports every DRG—a small community hospital may appear for 40 or 50 DRGs while a large academic medical center appears for 600 or more.
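The suppression rule is easy to reproduce on unsuppressed or synthetic data. The sketch below uses made-up rows and a hypothetical `publishable` helper (neither comes from CMS) to show which hospital and DRG pairs would survive the 11-discharge floor.

```python
MIN_DISCHARGES = 11  # CMS privacy floor described above

# Made-up rows for illustration only; these are not real CMS records.
rows = [
    {"provider_id": "000001", "drg": "470", "total_discharges": 142},
    {"provider_id": "000001", "drg": "001", "total_discharges": 3},
    {"provider_id": "000002", "drg": "470", "total_discharges": 11},
    {"provider_id": "000002", "drg": "291", "total_discharges": 10},
]


def publishable(records, floor=MIN_DISCHARGES):
    """Keep only hospital-DRG pairs at or above the discharge floor."""
    return [r for r in records if r["total_discharges"] >= floor]


kept = publishable(rows)
print(f"{len(kept)} of {len(rows)} hospital-DRG pairs would be published")
```

Because the suppressed pairs are simply absent from the public file, totals computed from it will understate a hospital's true Medicare volume.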
Each row in the dataset represents a unique combination of hospital and DRG, with four key numeric fields: total discharges during the year, average covered charges (the hospital's chargemaster price), average total payments (all payers combined, including Medicare, secondary insurers, and patient cost-sharing), and average Medicare payments (what Medicare actually paid the hospital net of patient deductibles and coinsurance). The hospital is identified by its CMS Certification Number (CCN), name, address, city, state, zip code, and Hospital Referral Region, the geographic market unit defined by Dartmouth Atlas methodology.
The DRG system and IPPS payment formula
Medicare pays hospitals for inpatient stays under the Inpatient Prospective Payment System, introduced in 1983 as a replacement for cost-based reimbursement. Under the old cost-based system, hospitals were reimbursed for whatever they spent on a case. IPPS replaced that with a fixed payment per discharge, determined primarily by the patient's DRG assignment rather than by the hospital's actual costs on that case.
A Diagnosis Related Group is a clinical category that groups inpatient cases with similar resource consumption profiles. The MS-DRG (Medicare Severity DRG) system, in use since 2008, defines approximately 760 groups and subdivides many conditions into three tiers based on complication and comorbidity severity: without complications or comorbidities (w/o CC/MCC), with complications or comorbidities (w/ CC), and with major complications or comorbidities (w/ MCC). A patient hospitalized with a hip fracture and no significant comorbidities lands in a different DRG than an otherwise identical patient with congestive heart failure or kidney disease, and Medicare pays substantially more for the more complex case.
The IPPS payment formula is:
Payment = Base Rate × Relative Weight × Adjustments
The base rate (the national standardized amount) is set annually by CMS in the IPPS Final Rule; hospitals that do not meet quality-reporting and electronic health record requirements receive a slightly reduced amount. For FY2023 the national standardized amount was approximately $6,000. The relative weight (RW) captures the resource intensity of the DRG relative to the average case. A DRG with RW = 1.0 pays exactly the base rate before adjustments. Higher-RW DRGs cost more to treat and pay proportionally more.
Representative relative weights from the FY2023 MS-DRG grouper illustrate the range:
- DRG 001 — Heart Transplant or Implant of Heart Assist System w/ MCC: RW approximately 25.0, among the highest-weight DRGs, reflecting weeks of intensive care, complex surgery, and expensive pharmaceuticals. Medicare payment at the national base rate would be roughly $150,000 before other adjustments.
- DRG 870 — Septicemia or Severe Sepsis w/ MV >96 Hours: RW approximately 10.0. Prolonged mechanical ventilation with sepsis is resource-intensive and carries high mortality.
- DRG 470 — Major Joint Replacement of Lower Extremity w/o MCC: RW approximately 2.1. The highest-volume elective DRG in Medicare; hip and knee replacements for patients without major comorbidities. National average Medicare payment approximately $16,000–$18,000 once wage index and add-on adjustments are included; the unadjusted base rate × RW product at the figures above is about $12,600.
- DRG 291 — Heart Failure & Shock w/ MCC: RW approximately 1.9. One of the most common medical DRGs.
- DRG 392 — Esophagitis, Gastroenteritis & Miscellaneous Digestive Disorders w/o MCC: RW approximately 0.7. Low-complexity GI admissions at below-average resource intensity.
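To make the formula concrete, the following sketch multiplies the approximate base rate by the approximate relative weights listed above. The figures are the rounded values quoted in this article, not official CMS table entries, and `unadjusted_payment` is a hypothetical helper name.

```python
# Rounded figures quoted in this article; not official CMS values.
BASE_RATE = 6_000.0  # approximate FY2023 national standardized amount

RELATIVE_WEIGHTS = {
    "001": 25.0,  # heart transplant w/ MCC
    "870": 10.0,  # sepsis w/ MV >96 hours
    "470": 2.1,   # major joint replacement w/o MCC
    "291": 1.9,   # heart failure & shock w/ MCC
    "392": 0.7,   # miscellaneous digestive disorders w/o MCC
}


def unadjusted_payment(drg_code: str, base_rate: float = BASE_RATE) -> float:
    """Base rate x relative weight, before any hospital-specific adjustment."""
    return base_rate * RELATIVE_WEIGHTS[drg_code]


for code in RELATIVE_WEIGHTS:
    print(f"DRG {code}: ${unadjusted_payment(code):,.0f}")
```

The result is only the starting point: wage index, teaching, DSH, and outlier adjustments are applied on top of this product.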
The relative weights are recalibrated every year by CMS using the most recent settled Medicare cost reports. The recalibration is designed to be budget-neutral: CMS applies a factor to the base rate that offsets the aggregate effect of RW changes, so that total IPPS payments to all hospitals remain constant absent volume and mix changes.
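A toy version of the budget-neutrality idea can be sketched as follows. This is a simplification under stated assumptions: the case counts and weights are invented, `budget_neutrality_factor` is a hypothetical helper, and the actual CMS calculation involves more steps than a single ratio.

```python
def budget_neutrality_factor(cases, old_rw, new_rw):
    """Factor to apply to the base rate so aggregate payments are unchanged.

    cases:  DRG code -> number of discharges (held fixed)
    old_rw: DRG code -> relative weight before recalibration
    new_rw: DRG code -> relative weight after recalibration
    """
    old_total = sum(cases[d] * old_rw[d] for d in cases)
    new_total = sum(cases[d] * new_rw[d] for d in cases)
    return old_total / new_total


# Invented two-DRG example: DRG "A" gets a higher weight after recalibration.
cases = {"A": 100, "B": 300}
old_rw = {"A": 2.0, "B": 1.0}
new_rw = {"A": 2.2, "B": 1.0}

factor = budget_neutrality_factor(cases, old_rw, new_rw)
print(f"Base rate is multiplied by {factor:.4f}")
```

In this example the weight increase for one DRG pulls the base rate down slightly, so DRG "B" cases are paid a little less even though their own weight did not change.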
Geographic and hospital-type adjustments
The base rate × RW formula produces a standardized payment that is then modified by a series of adjustments reflecting hospital characteristics and local cost conditions. These adjustments explain a significant fraction of the payment variation observable in the CMS charge data.
Wage index
Labor costs account for roughly 65% of hospital operating expenses, and labor markets vary substantially by geography. CMS adjusts the labor-related share of the base rate by a hospital-specific wage index derived from the wage and hours data that hospitals report on their Medicare cost reports, aggregated by Core Based Statistical Area (CBSA). Hospitals in high-wage CBSAs (San Jose, San Francisco, New York, Boston) receive a wage index well above 1.0, substantially increasing their IPPS payment per discharge. Hospitals in low-wage rural CBSAs may have a wage index below 0.8, reducing their payment. The non-labor-related share of the base rate is not adjusted by the wage index. The net effect is that two hospitals treating the same DRG receive different Medicare payments solely because of where they are located.
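The wage adjustment can be sketched in a few lines. The 65% labor share and $6,000 base rate are the rough figures used in this article rather than the exact values in the IPPS Final Rule, and `wage_adjusted_rate` is a hypothetical helper.

```python
def wage_adjusted_rate(base_rate, wage_index, labor_share=0.65):
    """Scale only the labor-related share of the base rate by the wage index."""
    labor_portion = base_rate * labor_share * wage_index
    non_labor_portion = base_rate * (1.0 - labor_share)
    return labor_portion + non_labor_portion


BASE_RATE = 6_000.0  # approximate figure used in this article

for label, wi in [("high-wage CBSA", 1.5), ("average", 1.0), ("low-wage rural", 0.8)]:
    print(f"{label:<15} WI={wi:.1f}  rate=${wage_adjusted_rate(BASE_RATE, wi):,.0f}")
```

Because the non-labor share is left untouched, a wage index of 1.5 raises the rate by less than 50%.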
Indirect Medical Education add-on
Teaching hospitals incur higher costs because they train resident physicians who perform more tests and procedures (and take longer doing them) than experienced attendings. CMS compensates via an Indirect Medical Education (IME) payment add-on calculated from each hospital's ratio of interns and residents to beds. As a rule of thumb, the IME formula yields approximately a 5.5% payment increase for each 0.1 increment in that ratio. A major academic medical center with a resident-to-bed ratio of 0.6 might receive an IME add-on of roughly 33% above the base DRG payment under that linear approximation (the statutory formula is curved, so the actual add-on at high ratios is somewhat smaller). This is in addition to the Direct Graduate Medical Education (DGME) payments that compensate hospitals for the direct costs of residency programs, which are funded separately from IPPS.
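The sketch below compares the 5.5%-per-0.1 rule of thumb with the curved form of the operating IME adjustment. The multiplier 1.35 and exponent 0.405 are stated here as an assumption about the current statutory formula and should be checked against the IPPS Final Rule; both function names are hypothetical.

```python
def ime_linear_rule(resident_to_bed_ratio):
    """Rule of thumb from the text: about 5.5% per 0.1 of resident-to-bed ratio."""
    return 0.055 * (resident_to_bed_ratio / 0.1)


def ime_curved(resident_to_bed_ratio, multiplier=1.35, exponent=0.405):
    """Assumed statutory form: multiplier * ((1 + ratio) ** exponent - 1)."""
    return multiplier * ((1.0 + resident_to_bed_ratio) ** exponent - 1.0)


for ratio in (0.1, 0.3, 0.6):
    print(
        f"ratio={ratio:.1f}  linear={ime_linear_rule(ratio):.1%}  "
        f"curved={ime_curved(ratio):.1%}"
    )
```

The two agree closely at low ratios and diverge as the ratio grows, which is why the linear rule overstates the add-on for the most resident-heavy hospitals.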
Disproportionate Share Hospital add-on
Hospitals that serve a disproportionate share of low-income Medicare and Medicaid patients receive Disproportionate Share Hospital (DSH) payments on top of their DRG-based reimbursement. The DSH calculation is based on the sum of the SSI (Supplemental Security Income) percentage of Medicare days and the Medicaid percentage of total days. Hospitals with higher proportions of low-income patients receive larger DSH adjustments. The Affordable Care Act restructured DSH payments beginning in FY2014: 25% of the pre-ACA DSH amount is paid as “empirically justified” DSH, while 75% is distributed as Uncompensated Care payments based on each hospital's share of total uncompensated care costs. Safety-net hospitals in urban areas with large Medicaid and uninsured populations are the primary beneficiaries of DSH and uncompensated care payments.
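The two building blocks described above, the DSH patient percentage and the 25%/75% split, can be sketched as follows. The day counts and dollar amount are invented, and both helper names are hypothetical; the real calculation maps the patient percentage through tiered formulas that are not shown here.

```python
def dsh_patient_percentage(ssi_medicare_days, total_medicare_days,
                           medicaid_days, total_patient_days):
    """SSI share of Medicare days plus Medicaid share of all patient days."""
    ssi_fraction = ssi_medicare_days / total_medicare_days
    medicaid_fraction = medicaid_days / total_patient_days
    return ssi_fraction + medicaid_fraction


def split_pre_aca_dsh(pre_aca_amount):
    """Post-FY2014 split: 25% stays as DSH, 75% is redirected to the
    uncompensated care pool (distributed by each hospital's share of
    uncompensated care, not returned to the same hospital)."""
    return {
        "empirically_justified": 0.25 * pre_aca_amount,
        "redirected_to_uncompensated_care": 0.75 * pre_aca_amount,
    }


# Invented example hospital.
dpp = dsh_patient_percentage(2_000, 10_000, 4_500, 30_000)
print(f"DSH patient percentage: {dpp:.1%}")
print(split_pre_aca_dsh(1_000_000.0))
```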
Critical Access Hospitals
Not all hospitals are paid under IPPS. Critical Access Hospitals (CAHs)—small rural hospitals with 25 or fewer acute care beds, located more than 35 miles from the nearest hospital (or more than 15 miles in mountainous terrain or over secondary roads)—are exempt from IPPS entirely. CAHs are instead reimbursed at 101% of their reasonable costs, a cost-based method designed to ensure financial viability for hospitals serving isolated rural communities. Because CAHs do not operate under IPPS, they do not appear in the CMS Inpatient Provider Charge Data.
Outlier payments
No fixed-payment system can anticipate every extremely costly case. IPPS includes an outlier payment mechanism for cases where the hospital's estimated costs exceed the DRG payment by more than a fixed threshold (the “fixed loss threshold”), set annually by CMS. For FY2023 this threshold was approximately $40,000 above the DRG payment. Once a case crosses this threshold, Medicare pays 80% of the incremental cost above it. Outlier payments protect hospitals from catastrophic losses on extraordinarily complex cases while keeping the aggregate outlier pool to approximately 5–6% of total IPPS payments. Because outlier payments depend on billed charges (converted to estimated costs via a cost-to-charge ratio), hospitals with very high chargemasters may more readily trigger outlier qualification.
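A minimal sketch of the outlier mechanics, assuming the approximate $40,000 fixed loss threshold and 80% marginal rate quoted above. The charges, cost-to-charge ratio, and DRG payment are invented, and the helper names are hypothetical.

```python
def estimated_cost(covered_charges, cost_to_charge_ratio):
    """Convert billed charges into estimated cost using the hospital's CCR."""
    return covered_charges * cost_to_charge_ratio


def outlier_payment(est_cost, drg_payment, fixed_loss=40_000.0, marginal_rate=0.80):
    """Pay a share of the cost above (DRG payment + fixed loss threshold)."""
    threshold = drg_payment + fixed_loss
    return max(0.0, marginal_rate * (est_cost - threshold))


# Invented case: $400,000 billed, CCR of 0.25, $30,000 DRG payment.
cost = estimated_cost(400_000.0, 0.25)
extra = outlier_payment(cost, drg_payment=30_000.0)
print(f"Estimated cost ${cost:,.0f}, outlier payment ${extra:,.0f}")
```

Holding the cost-to-charge ratio fixed, a higher billed charge raises the estimated cost, which is the channel through which chargemaster levels can affect outlier qualification.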
The charge-to-payment gap
The most striking feature of the CMS charge data is the gap between average covered charges (the hospital's chargemaster price) and average Medicare payments. Across the full dataset, average covered charges routinely run 4x to 10x the Medicare payment for the same DRG at the same hospital. Individual hospitals have charge-to-payment ratios exceeding 15x for certain DRGs. For DRG 470 (Major Joint Replacement), the national median charge-to-payment ratio is approximately 4–5x, but the right tail of the distribution includes hospitals billing $90,000–$120,000 while collecting $15,000–$18,000 from Medicare.
Chargemaster prices are set entirely at the hospital's discretion. Unlike most markets where posted prices bear some relationship to transaction prices, hospital chargemasters are negotiating artifacts: the starting point for contractual discounts with commercial insurers. A commercial insurer negotiating a hospital contract typically receives a percentage-off-charges discount; hospitals with higher chargemasters may therefore extract larger absolute payments for the same services from commercial payers even when the discount percentage is identical. Uninsured patients without negotiating power historically faced chargemaster prices directly, a practice that led to high-profile disputes and helped motivate the federal Hospital Price Transparency rule and the No Surprises Act.
For Medicare, chargemaster prices are economically irrelevant except insofar as they affect outlier payment qualification. Medicare pays DRG rates; it does not pay charges. But the charge data published by CMS provides a window into hospital pricing behavior and pricing opacity that no other public dataset offers. The ratio of charges to Medicare payments varies not just by hospital but by DRG within the same hospital, with some service lines marked up more aggressively than others. Surgical DRGs and implant-heavy procedures tend to carry higher markup ratios than medical DRGs because the chargemaster for devices and prosthetics is less standardized and less scrutinized than for pharmacy or routine nursing.
Geographic payment variation
Medicare payments for identical DRGs vary substantially across hospitals in the CMS dataset. For DRG 470 (Major Joint Replacement), average Medicare payments range from approximately $12,000 at lower-cost hospitals in lower-wage markets to $35,000 or more at high-wage urban teaching hospitals. This is a nearly 3x spread for what is nominally a standardized procedure with well-defined clinical protocols.
Some of this variation is mechanically explained: wage index differences between San Francisco and rural Mississippi account for a substantial portion of the spread. IME adjustments at teaching hospitals add 20%–40% on top of base payments. DSH adjustments at safety-net hospitals add further. But these structural factors explain only part of the geographic variation documented in the Dartmouth Atlas of Health Care, the most comprehensive academic study of Medicare spending variation.
The most famous illustration of unexplained geographic variation is the McAllen–El Paso comparison described by Atul Gawande in his 2009 New Yorker article “The Cost Conundrum.” McAllen, Texas had one of the highest per-capita Medicare spending rates in the country. El Paso, 800 miles away, had nearly identical demographics, poverty rates, and disease burden—but Medicare spending per beneficiary was roughly half. The same gap appeared across dozens of paired communities: Rochester, Minnesota (Mayo Clinic) against Miami, Florida; Grand Junction, Colorado against Los Angeles. High-spending regions were not producing better outcomes; in many cases, lower-spending regions with more evidence-based care patterns had better quality metrics.
The Dartmouth researchers attributed the variation primarily to differences in the supply of hospital beds and specialist physicians (which drives volume of services regardless of clinical need), local care culture, and the degree to which hospitals and physicians operated under evidence-based protocols versus fee-for-service volume incentives. The CMS inpatient charge data, when combined with beneficiary population data from CMS chronic conditions files or the Medicare claims research files, allows replication of this type of analysis at the DRG level.
Value-based care payment adjustments
Since the passage of the Affordable Care Act in 2010, CMS has layered three additional payment adjustment programs onto the IPPS base that modify final payments up or down based on quality performance rather than case volume or complexity.
Hospital Value-Based Purchasing
The Hospital Value-Based Purchasing (HVBP) program withholds 2% of each hospital's Medicare IPPS payments into a pool and then redistributes the pool based on quality scores across four domains: clinical outcomes, safety, efficiency and cost reduction, and patient experience (HCAHPS survey). Hospitals scoring above the national mean receive back more than their 2% withhold; hospitals below the mean receive less. The best performers receive a bonus of roughly 2.5–3.0% above baseline; the worst performers receive essentially nothing back from their withhold. For a hospital receiving $200 million in annual IPPS payments, the difference between the best and worst HVBP outcomes can be $8–10 million per year.
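The arithmetic behind the withhold can be sketched as below. The percentages are the approximate ones quoted above, and `hvbp_net_change` is a hypothetical helper; the actual program converts a Total Performance Score into a hospital-specific payment factor.

```python
def hvbp_net_change(ipps_payments, earned_back_pct, withhold_pct=2.0):
    """Net dollar effect of HVBP: amount earned back minus the withhold.

    Percentages are in percentage points of annual IPPS payments.
    """
    return ipps_payments * (earned_back_pct - withhold_pct) / 100.0


ANNUAL_IPPS = 200_000_000.0  # the $200 million hospital from the text

worst = hvbp_net_change(ANNUAL_IPPS, earned_back_pct=0.0)  # nothing earned back
best = hvbp_net_change(ANNUAL_IPPS, earned_back_pct=4.5)   # 2.5 points above baseline
print(f"Worst case: ${worst:,.0f}")
print(f"Best case:  ${best:,.0f}")
print(f"Spread:     ${best - worst:,.0f}")
```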
Hospital Readmissions Reduction Program
The Hospital Readmissions Reduction Program (HRRP) penalizes hospitals with excess 30-day readmission rates relative to a national risk-adjusted benchmark. CMS tracks readmissions for six conditions: acute myocardial infarction (AMI), heart failure (HF), pneumonia, chronic obstructive pulmonary disease (COPD), coronary artery bypass graft (CABG) surgery, and total knee and hip arthroplasty (TKA/THA—DRG 470 and its bilateral equivalents). Hospitals with excess readmissions receive a payment reduction of up to 3% applied across all Medicare IPPS admissions, not just the conditions that triggered the penalty. The program has been controversial: evidence suggests it may have contributed to a reduction in heart failure readmissions but also may have incentivized hospitals to keep patients in “observation status” rather than formal inpatient admission, which does not count as an inpatient readmission but shifts costs to patients because observation status is covered under Part B rather than the Part A inpatient benefit.
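The cap on the penalty can be expressed as a simple payment factor. This sketch takes the hospital's excess-readmission share as a given input (computing it requires risk-adjusted CMS data not covered here), and `hrrp_payment_factor` is a hypothetical helper.

```python
def hrrp_payment_factor(excess_share, cap=0.03):
    """Multiplier applied to IPPS payments: 1 minus the capped penalty.

    excess_share: share of payments attributed to excess readmissions
                  (0 or negative means no excess, so no penalty).
    """
    penalty = min(max(excess_share, 0.0), cap)
    return 1.0 - penalty


for share in (-0.02, 0.01, 0.05):
    print(f"excess share {share:+.2f} -> payment factor {hrrp_payment_factor(share):.2f}")
```

Because the factor applies to all of the hospital's IPPS discharges, a small excess in one condition reduces payment on every DRG.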
Hospital-Acquired Condition Reduction Program
The Hospital-Acquired Condition Reduction Program (HACRP) imposes a 1% payment reduction on the 25% of hospitals with the worst hospital-acquired condition (HAC) performance scores. The HAC score aggregates measures of central line-associated bloodstream infections (CLABSI), catheter-associated urinary tract infections (CAUTI), surgical site infections, Clostridioides difficile infections, methicillin-resistant Staphylococcus aureus infections, and a patient safety composite that includes in-hospital falls. HACRP is a binary penalty: hospitals in the worst quartile lose 1% of IPPS payments; all others are unaffected. Unlike HVBP, there is no upside potential, only penalty avoidance. The program creates an incentive for hospitals to invest in infection control and patient safety measures that reduce HAC rates, though critics note that some HACs are difficult to avoid regardless of care quality and that risk adjustment may be inadequate for hospitals treating the most complex patients.
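The binary, worst-quartile structure can be illustrated with a small ranking sketch. The scores are invented, higher is assumed to mean worse, ties are ignored, and `hacrp_penalized` is a hypothetical helper; CMS applies its own percentile threshold rather than this simple cut.

```python
def hacrp_penalized(scores):
    """Return the set of hospitals in the worst-scoring quarter.

    scores: hospital name -> total HAC score (higher = worse, by assumption).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # worst first
    n_penalized = len(ranked) // 4
    return set(ranked[:n_penalized])


# Invented scores for eight hypothetical hospitals.
scores = {
    "A": 0.91, "B": -0.40, "C": 0.15, "D": 1.32,
    "E": -1.05, "F": 0.02, "G": 0.47, "H": -0.22,
}

penalized = hacrp_penalized(scores)
for name in sorted(scores):
    factor = 0.99 if name in penalized else 1.00
    print(f"Hospital {name}: payment factor {factor:.2f}")
```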
Dataset structure and access
The CMS Inpatient Provider Charge Data is available at data.cms.gov/provider-data under the “Hospital datasets” section. Two variants are published: the “Top 100 DRGs” file covering only the 100 highest-volume DRGs nationally, and the full DRG file covering all reported DRG codes. Both are available as downloadable CSV and via the Socrata API.
The dataset schema for each row includes:
- DRG Definition: the MS-DRG code and description (e.g., “470 - MAJOR JOINT REPLACEMENT OR REATTACHMENT OF LOWER EXTREMITY W/O MCC”)
- Provider Id: the CMS Certification Number (CCN), a six-digit identifier for the hospital
- Provider Name
- Provider Street Address
- Provider City
- Provider State
- Provider Zip Code
- Hospital Referral Region Description: the Dartmouth Atlas HRR label (state abbreviation and reference city, e.g., “TX - Dallas”)
- Total Discharges: count of Medicare IPPS discharges for that DRG at that hospital during the year
- Average Covered Charges: mean chargemaster price billed by the hospital
- Average Total Payments: mean total payment from all sources (Medicare plus secondary insurance plus patient cost-sharing)
- Average Medicare Payments: mean net payment from Medicare alone
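Because the DRG code and description share one field, it is often convenient to split them before grouping or joining. The following sketch uses only the standard library; `split_drg_definition` is a hypothetical helper, and the pattern assumes the "code - description" layout shown in the example above.

```python
import re

# Three-digit code, a hyphen, then the free-text description.
DRG_PATTERN = re.compile(r"^\s*(\d{3})\s*-\s*(.+?)\s*$")


def split_drg_definition(text):
    """Split '470 - DESCRIPTION' into ('470', 'DESCRIPTION')."""
    match = DRG_PATTERN.match(text)
    if match is None:
        raise ValueError(f"Unrecognized DRG definition: {text!r}")
    return match.group(1), match.group(2)


code, description = split_drg_definition(
    "470 - MAJOR JOINT REPLACEMENT OR REATTACHMENT OF LOWER EXTREMITY W/O MCC"
)
print(code)
print(description)
```

Keeping the code as a string preserves leading zeros in codes such as "001".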
The Socrata API at data.cms.gov/resource/ supports field filtering, row-level query, and JSON or CSV output. No API key is required for read access at reasonable query volumes. The bulk CSV downloads are the most efficient path for full-dataset analysis; the API is best for targeted queries by DRG, state, or provider. Historical releases are archived on the CMS website, and NBER maintains hospital-level cost report data (HCRIS) that can be joined to the charge data for deeper cost analysis.
Python example: DRG 470 charge-to-payment analysis
The following script queries the CMS Socrata API for DRG 470 (Major Joint Replacement of Lower Extremity w/o MCC), computes the charge-to-payment ratio for each hospital, identifies the highest- and lowest-ratio providers, and computes discharge-weighted average Medicare payments by state. No API key is required.
import io

import pandas as pd
import requests

# ---------------------------------------------------------------------------
# CMS Medicare Inpatient Provider Charge Data
# Source: data.cms.gov Socrata API (no API key required for read access)
# Dataset: "Inpatient Prospective Payment System (IPPS) Provider Summary"
# Socrata endpoint for the most recent year (FY2023 as of 2025):
# ---------------------------------------------------------------------------
SOCRATA_BASE = "https://data.cms.gov/resource"

# FY2023 full DRG dataset (all DRGs, not just top 100)
# Socrata dataset ID may change with new annual releases; verify at data.cms.gov
DATASET_ID = "7pny-wwx7"  # FY2023 IPPS Provider Summary for All DRGs


def fetch_drg_data(drg_code: str, limit: int = 50000) -> pd.DataFrame:
    """
    Fetch all hospital records for a specific DRG from the CMS Socrata API.
    drg_code example: "470" for Major Joint Replacement of Lower Extremity
    """
    url = f"{SOCRATA_BASE}/{DATASET_ID}.csv"
    params = {
        "$where": f"drg_definition LIKE '{drg_code} - %'",
        "$limit": limit,
    }
    resp = requests.get(url, params=params, timeout=120)
    resp.raise_for_status()
    # Read everything as text; numeric columns are converted in clean_numeric_cols
    return pd.read_csv(io.StringIO(resp.text), dtype=str, low_memory=False)


def load_full_dataset(limit: int = 500000) -> pd.DataFrame:
    """
    Load the full IPPS dataset from the Socrata API.
    At ~500k rows this will take 30-60 seconds depending on bandwidth.
    Alternative: download the bulk CSV directly from data.cms.gov/provider-data.
    """
    url = f"{SOCRATA_BASE}/{DATASET_ID}.csv"
    params = {"$limit": limit}
    resp = requests.get(url, params=params, timeout=300)
    resp.raise_for_status()
    df = pd.read_csv(io.StringIO(resp.text), dtype=str, low_memory=False)
    print(f"Loaded {len(df):,} rows, {df['drg_definition'].nunique()} unique DRGs, "
          f"{df['provider_id'].nunique()} unique providers")
    return df


def clean_numeric_cols(df: pd.DataFrame) -> pd.DataFrame:
    """Strip dollar signs and commas; convert payment and count columns to float."""
    numeric_cols = [
        "average_covered_charges",
        "average_total_payments",
        "average_medicare_payments",
        "total_discharges",
    ]
    for col in numeric_cols:
        if col in df.columns:
            df[col] = (
                df[col]
                .astype(str)
                .str.replace(r"[$,]", "", regex=True)
                .pipe(pd.to_numeric, errors="coerce")
            )
    return df


def analyze_drg_470(df: pd.DataFrame) -> None:
    """
    Analyze DRG 470: Major Joint Replacement of Lower Extremity w/o MCC.
    Compute charge-to-payment ratio; identify high and low outliers; state averages.
    """
    mask = df["drg_definition"].astype(str).str.startswith("470 -")
    drg470 = df[mask].copy()
    drg470 = clean_numeric_cols(drg470)

    # Drop rows with missing or non-positive payment data
    drg470 = drg470.dropna(subset=["average_medicare_payments", "average_covered_charges"])
    drg470 = drg470[drg470["average_medicare_payments"] > 0]

    # Charge-to-payment ratio: how many dollars of chargemaster price per dollar paid
    drg470["charge_to_payment_ratio"] = (
        drg470["average_covered_charges"] / drg470["average_medicare_payments"]
    )

    total = len(drg470)
    print(f"DRG 470 records: {total:,}")
    print()

    # National summary statistics
    print("--- National DRG 470 Summary ---")
    print(f"Median Medicare payment: ${drg470['average_medicare_payments'].median():>10,.0f}")
    print(f"Mean Medicare payment: ${drg470['average_medicare_payments'].mean():>10,.0f}")
    print(f"Min Medicare payment: ${drg470['average_medicare_payments'].min():>10,.0f}")
    print(f"Max Medicare payment: ${drg470['average_medicare_payments'].max():>10,.0f}")
    print(f"Median covered charges: ${drg470['average_covered_charges'].median():>10,.0f}")
    print(f"Median charge/payment ratio: {drg470['charge_to_payment_ratio'].median():>10.1f}x")
    print()

    # Top 10 highest charge-to-payment ratio hospitals
    top_ratio = drg470.nlargest(10, "charge_to_payment_ratio")[
        ["provider_name", "provider_state", "average_covered_charges",
         "average_medicare_payments", "charge_to_payment_ratio"]
    ]
    print("--- Highest Charge-to-Payment Ratio Hospitals (DRG 470) ---")
    for _, row in top_ratio.iterrows():
        print(
            f" {str(row['provider_name'])[:45]:<45} {row['provider_state']} "
            f"charges=${row['average_covered_charges']:>8,.0f} "
            f"paid=${row['average_medicare_payments']:>7,.0f} "
            f"ratio={row['charge_to_payment_ratio']:.1f}x"
        )
    print()

    # Bottom 10 by Medicare payment (lowest-cost providers)
    lowest_pay = drg470.nsmallest(10, "average_medicare_payments")[
        ["provider_name", "provider_state", "average_medicare_payments",
         "total_discharges"]
    ]
    print("--- Lowest Average Medicare Payment Hospitals (DRG 470) ---")
    for _, row in lowest_pay.iterrows():
        print(
            f" {str(row['provider_name'])[:45]:<45} {row['provider_state']} "
            f"avg_payment=${row['average_medicare_payments']:>8,.0f} "
            f"discharges={row['total_discharges']:.0f}"
        )
    print()

    # State-level average Medicare payment (weighted by discharges).
    # Precompute payment x discharges so a plain groupby/agg can sum it.
    drg470["payment_x_discharges"] = (
        drg470["average_medicare_payments"] * drg470["total_discharges"]
    )
    state_summary = drg470.groupby("provider_state").agg(
        hospitals=("provider_name", "size"),
        total_discharges=("total_discharges", "sum"),
        payment_x_discharges=("payment_x_discharges", "sum"),
    )
    state_summary["weighted_avg_payment"] = (
        state_summary["payment_x_discharges"] / state_summary["total_discharges"]
    )
    state_summary = state_summary.sort_values("weighted_avg_payment", ascending=False)

    print("--- State-Level Weighted Average Medicare Payment, DRG 470 ---")
    print(f"{'State':<6} {'Hospitals':>9} {'Discharges':>11} {'Wtd Avg Payment':>17}")
    print("-" * 46)
    for state, row in state_summary.iterrows():
        print(
            f"{state:<6} {row['hospitals']:>9.0f} {row['total_discharges']:>11,.0f}"
            f" ${row['weighted_avg_payment']:>15,.0f}"
        )


def main() -> None:
    print("Fetching CMS IPPS FY2023 data for DRG 470...")
    df = fetch_drg_data("470")
    if df.empty:
        print("No rows returned for DRG 470. Loading full dataset as fallback...")
        df = load_full_dataset()
    analyze_drg_470(df)


if __name__ == "__main__":
    main()
The charge-to-payment ratio computed above is the primary diagnostic for chargemaster opacity. A ratio of 5.0x means the hospital bills $5 for every $1 Medicare pays. This is not fraud (it is the standard structure of US hospital pricing), but it illustrates why uninsured and out-of-network patients can face catastrophic bills that bear no relationship to the rates Medicare actually pays, which are set administratively rather than negotiated. The state-level weighted averages highlight the geographic dimension: California, New York, and New Jersey typically show the highest average Medicare payments for DRG 470 due to wage index effects, while states in the Mountain West and South show lower payments. Overlaying penalty and quality data from the HVBP and HRRP datasets (also available on data.cms.gov) allows analysts to test whether hospitals with high payment variation also have worse readmission or quality outcomes.
For the broader federal fiscal context behind Medicare spending, see Treasury Daily Treasury Statement: The Federal Cash Flow Data Published Every Business Day, which covers how Medicare and Social Security outlays appear in the daily TGA transaction data and tracks total federal health spending alongside other program categories.
For interest rate context relevant to hospital capital financing and Medicare trust fund investment returns, see Federal Reserve H.15: The Selected Interest Rates Release Behind Treasury Yields, Fed Funds, and Every Rate Benchmark, covering the Treasury yield curve, SOFR, and the rate environment that shapes hospital bond issuance and Medicare Part A trust fund projections.