About This Project
Data Source
This dashboard visualizes the HHS Medicaid Provider Spending Dataset, published by the U.S. Department of Health and Human Services on the Medicaid Open Data Portal.
The dataset contains 227 million rows of claim-level spending data from 2018 through 2024, covering $1.09 trillion in Medicaid payments to over 617,000 unique billing providers across all 50 states, DC, and U.S. territories.
Methodology
Raw claims data was enriched with provider names and locations from the NPPES (National Plan and Provider Enumeration System), procedure descriptions from HCPCS (Healthcare Common Procedure Coding System), and provider classifications from NUCC (National Uniform Claim Committee) taxonomy codes.
All aggregations were computed using DuckDB against the raw Parquet files. The web application serves pre-computed JSON files; no real-time queries are executed in the browser.
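The enrichment and pre-computation steps can be sketched roughly as follows. This is a plain-Python illustration of the logic only; the project itself runs its aggregations in DuckDB over Parquet files. The field names (`npi`, `year`, `paid`), the sample values, and the function names `enrich` and `aggregate_by_state_year` are hypothetical and are not taken from the dataset's actual schema.

```python
import json
from collections import defaultdict

# Hypothetical claim rows; the real dataset's column names may differ.
claims = [
    {"npi": "1000000001", "year": 2023, "paid": 1200.0},
    {"npi": "1000000001", "year": 2024, "paid": 1500.0},
    {"npi": "1000000002", "year": 2024, "paid": 300.0},
    {"npi": "1000000003", "year": 2024, "paid": 50.0},  # no NPPES match
]

# Hypothetical NPPES lookup keyed by NPI.
nppes = {
    "1000000001": {"name": "EXAMPLE CLINIC LLC", "state": "OH"},
    "1000000002": {"name": "SAMPLE HOME CARE", "state": "TX"},
}


def enrich(rows, providers):
    """Left-join claim rows to provider records, keeping unmatched rows."""
    enriched = []
    for row in rows:
        provider = providers.get(row["npi"], {})
        enriched.append({
            **row,
            "name": provider.get("name"),
            "state": provider.get("state", "UNKNOWN"),
        })
    return enriched


def aggregate_by_state_year(rows):
    """Sum payments per (registered state, calendar year)."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["state"], row["year"])] += row["paid"]
    return [
        {"state": state, "year": year, "total_paid": total}
        for (state, year), total in sorted(totals.items())
    ]


summary = aggregate_by_state_year(enrich(claims, nppes))
# Serialize once at build time; the front end would only read the result.
summary_json = json.dumps(summary, indent=2)
```

Grouping on the provider's registered state mirrors the note below about state-level aggregations: the state comes from the NPPES record, not from the claim itself.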
- Spending figures are as reported by HHS and have not been adjusted for inflation.
- Provider locations come from NPPES registration data, not claim service locations.
- State-level aggregations use the billing provider's registered state, which may differ from where services were rendered.
- Year-over-year growth is calculated using calendar year totals.
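
As a minimal sketch of the year-over-year calculation described above, the function below compares each calendar-year total with the total for the preceding year. The function name `yoy_growth` and the sample totals are made up for illustration; amounts are nominal, matching the note that figures are not adjusted for inflation.

```python
def yoy_growth(totals_by_year):
    """Return {year: fractional growth versus the prior calendar year}.

    Years with no prior-year total, or a prior-year total of zero,
    are skipped because growth is undefined for them.
    """
    growth = {}
    for year in sorted(totals_by_year):
        prev = totals_by_year.get(year - 1)
        if prev:  # skips both a missing year (None) and a zero total
            growth[year] = (totals_by_year[year] - prev) / prev
    return growth


# Made-up totals for illustration only.
totals = {2022: 100.0, 2023: 110.0, 2024: 99.0}
growth = yoy_growth(totals)
```

Because the most recent year may be incomplete, a drop in the final year of such a series does not necessarily reflect a real decline in spending.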
Privacy
The source dataset already applies privacy protections: rows with fewer than 12 claims are excluded by HHS before publication. All data shown here is derived from publicly available government records.
Limitations
- This dataset covers Medicaid fee-for-service claims only. Managed care, which accounts for a significant portion of Medicaid spending, is not included.
- Provider names may contain typos or variations (e.g., LLC vs L.L.C.) since they come from NPPES self-reported data.
- Some NPIs (National Provider Identifiers) may map to different entities over time due to practice changes, mergers, or data entry issues.
- The dataset is updated annually. The most recent data may be incomplete due to claims processing lag.
Technology
Built with Next.js (static export), Tailwind CSS, Recharts, and react-simple-maps. Data processing uses Python with DuckDB. Hosted on Vercel with zero server costs.