Medicaid Provider Spending:
$530 Billion Under the Microscope
Independent analysis of 128 million Medicaid provider spending records (2019–2022). It uses statistical fraud detection methods, including z-score composites, Pareto analysis, and billing pattern anomalies, to identify where taxpayer dollars may be at risk.
Total Paid
$530B
2019–2022
Total Claims
3.5B
Individual claims
Providers
739K
Unique billing NPIs
Median Claim
$311
Mean: $4,144
Extreme Spending Concentration
Medicaid spending follows a classic Pareto pattern: a tiny fraction of providers accounts for more than half of all payments. Because of this concentration, fraud or waste among even a small number of top providers can represent billions in potential savings.
Spending Concentration (Pareto Curve)
Just 7,395 providers (top 1%) account for 52.3% of all Medicaid spending. The remaining 99% of providers share less than half.
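The top-1% share quoted above is a simple ranking calculation. The sketch below shows one way to compute it, assuming a plain list of per-provider payment totals; the function name `top_share` and the toy numbers are illustrative and do not come from the dataset.

```python
def top_share(totals, top_fraction=0.01):
    """Return (number of top providers, their share of total spending)."""
    if not totals:
        raise ValueError("totals must not be empty")
    ranked = sorted(totals, reverse=True)
    # Round to the nearest provider, but always keep at least one.
    k = max(1, round(len(ranked) * top_fraction))
    return k, sum(ranked[:k]) / sum(ranked)


# Hypothetical toy data: 2 very large billers and 198 small ones.
provider_totals = [5_000_000.0, 3_000_000.0] + [10_000.0] * 198
k, share = top_share(provider_totals, top_fraction=0.01)
print(f"top {k} providers hold {share:.1%} of spending")
```

Applied to the 739,498 billing providers in this analysis, the same rounding yields the 7,395 providers cited above.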
P25
$42
P50
$311
P75
$1.8K
P95
$14.2K
P99
$27.1K
Key insight
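The median payment is just $311, but the mean is $4,144, a 13x difference driven by extreme outliers. The 99th percentile ($27,134) is 87x the median, and the maximum exceeds $143 million. These amounts are best read as per-record figures (the data is pre-aggregated by provider, code, and month): $530B across 128 million records gives roughly $4,140, matching the mean, whereas $530B across 3.5 billion individual claims would give only about $151.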
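The ratios in this insight can be checked directly from the summary figures. A minimal sketch, using only the numbers reported on this page (the variable names are illustrative):

```python
median_paid = 311.0   # P50
mean_paid = 4_144.0   # reported mean
p99_paid = 27_134.0   # P99

mean_to_median = mean_paid / median_paid
p99_to_median = p99_paid / median_paid

print(f"mean / median = {mean_to_median:.1f}x")  # 13.3x
print(f"P99 / median  = {p99_to_median:.1f}x")   # 87.2x
```

A mean this far above the median is the usual sign of a heavily right-skewed distribution, which is why the percentile cards are more informative than the average alone.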
Where the Money Goes: Top Procedure Codes
Top 10 HCPCS Codes by Total Spending
| Code | Description | Total Paid | Avg/Claim | Providers |
|---|---|---|---|---|
| T1015 | Clinic visit/encounter, all-inclusive | $38.4B | $176.96 | 20,877 |
| T2025 | Waiver services, not otherwise specified | $17.3B | $128.97 | 3,253 |
| 99214 | Office visit - established patient, moderate complexity | $14.8B | $74.17 | 210,435 |
| T1019 | Personal care services | $12.7B | $64.35 | 14,221 |
| 99213 | Office visit - established patient, low complexity | $10.7B | $48.33 | 221,678 |
| S5125 | Attendant care services | $8.1B | $92.44 | 8,923 |
| T2003 | Non-emergency transportation, encounter/trip | $7.9B | $224.55 | 4,112 |
| H0015 | Alcohol/drug treatment - intensive outpatient | $6.8B | $83.14 | 15,766 |
| 99215 | Office visit - established patient, high complexity | $6.2B | $128.35 | 147,332 |
| J3490 | Unclassified drugs | $5.8B | $1,847.12 | 42,891 |
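Most Expensive Procedures (Avg Cost per Claim)
[Chart: the five procedure codes with the highest average cost per claim, limited to codes with at least 1,000 claims. Code labels and dollar values were not preserved; the bars carried claim counts of 3,142,000, 287,000, 412,000, 891,000, and 156,000.]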
Notable
Code J3490 ("Unclassified drugs") is both a top-10 spending code ($5.8B total) AND the most expensive per claim ($1,847 avg). This catch-all code lacks specific drug identification, making it a known vector for billing abuse.
Multi-Dimensional Fraud Detection
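We built a composite fraud scoring system that evaluates each provider across 4 statistical dimensions using z-scores: overpricing, claim volume, NPI mismatch, and code diversity. Providers with high z-scores on several dimensions at once receive the highest risk ratings. Monthly spending spikes are flagged as a separate check and are not part of the composite.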
Fraud Detection Dimensions
Overpriced Procedures
Charging >3σ above peer mean for same HCPCS code
High Claim Volume
Claims per beneficiary above 99th percentile
NPI Mismatch
>50% claims billed by different entity (≥500 claims)
Low Code Diversity
≥$1M spending on ≤5 HCPCS codes
Monthly Spikes
Spending >3σ above provider's own monthly baseline
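The first check above, overpricing relative to peers, can be sketched as follows. This is an illustration under stated assumptions, not the analysis's actual query: the row layout `(provider_id, hcpcs_code, claims, total_paid)` and the function name `flag_overpriced` are hypothetical, the peer group is assumed to include the provider itself, and the sample standard deviation is used. The ≥50 claims and ≥10 peers thresholds come from the Methodology Note.

```python
from collections import defaultdict
from statistics import mean, stdev

MIN_CLAIMS = 50  # per-provider claim floor from the Methodology Note
MIN_PEERS = 10   # minimum peer providers per code, same source
SIGMA = 3.0      # flag threshold from the dimension description


def flag_overpriced(rows, sigma=SIGMA):
    """rows: (provider_id, hcpcs_code, claims, total_paid) tuples.

    Returns sorted (provider_id, hcpcs_code, z) for providers whose average
    paid per claim is more than `sigma` sample standard deviations above
    the peer mean for that code.
    """
    # Sum monthly records into one (claims, paid) pair per provider and code.
    sums = defaultdict(lambda: [0, 0.0])
    for provider, code, claims, paid in rows:
        sums[(provider, code)][0] += claims
        sums[(provider, code)][1] += paid

    by_code = defaultdict(list)
    for (provider, code), (claims, paid) in sums.items():
        if claims >= MIN_CLAIMS:
            by_code[code].append((provider, paid / claims))

    flagged = []
    for code, peers in by_code.items():
        if len(peers) < MIN_PEERS:
            continue  # too few peers for a meaningful comparison
        averages = [avg for _, avg in peers]
        mu, sd = mean(averages), stdev(averages)
        if sd == 0:
            continue  # identical prices, nothing to flag
        for provider, avg in peers:
            z = (avg - mu) / sd
            if z > sigma:
                flagged.append((provider, code, round(z, 2)))
    return sorted(flagged)


# Hypothetical data: 20 typical providers plus one charging about 8x as much.
rows = [(f"P{i:02d}", "99213", 100, 100 * (48.0 + i % 5)) for i in range(20)]
rows.append(("PX", "99213", 100, 100 * 400.0))
# A code with only 3 providers is skipped, however extreme the prices.
rows += [("Q1", "J3490", 60, 6_000.0), ("Q2", "J3490", 60, 6_000.0),
         ("Q3", "J3490", 60, 600_000.0)]

flagged = flag_overpriced(rows)
print(flagged)
```

One design caveat: when a provider is included in its own peer group, its sample z-score cannot exceed (n-1)/√n, so a code with exactly 10 peers can never produce a score above 3. Computing the peer mean and deviation without the provider under test is one way around this.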
Providers Above 99th Percentile
4,891
Composite fraud score > 2.47
Combined Spending of Flagged
$28.5B
5.4% of total Medicaid spending
Top 5 Highest-Risk Providers (Anonymized)
| # | NPI | Total Paid | Claims | Codes | Mismatch% | Score |
|---|---|---|---|---|---|---|
| 1 | 1922xxxxx4 | $847M | 12.4M | 3 | 98.2% | 8.74 |
| 2 | 1134xxxxx7 | $623M | 8.9M | 2 | 95.1% | 7.91 |
| 3 | 1467xxxxx2 | $512M | 7.2M | 4 | 87.3% | 7.23 |
| 4 | 1891xxxxx8 | $398M | 5.6M | 1 | 99.7% | 6.88 |
| 5 | 1253xxxxx1 | $345M | 4.3M | 5 | 72.4% | 6.54 |
NPI numbers partially redacted. Scores represent composite z-score averages across overpricing, claim volume, NPI mismatch, and code diversity dimensions.
Pattern
The highest-risk providers share a striking profile: very few HCPCS codes (1-5), heavy billing/servicing NPI mismatch (72% to 99.7%, above 95% for three of the five), and hundreds of millions in total payments. This combination is consistent with organized billing schemes, although large legitimate billing entities can show the same profile.
The Billing/Servicing NPI Gap
Each Medicaid claim carries two NPIs: the billing provider (who submits the claim) and the servicing provider (who performs the service). When these differ persistently, it may indicate legitimate group practice billing, or it may signal shell companies, kickback arrangements, or identity fraud.
Same NPI
58.5%
$310B
Different NPI
41.5%
$220B
Spending Split
High-Risk Mismatch Providers
8,234
Providers with >50% mismatch (≥500 claims)
$12.4B
Total spending by high-mismatch providers
Context
Not all billing/servicing mismatches are fraudulent — group practices, hospital systems, and management companies legitimately bill on behalf of individual providers. However, when combined with other risk indicators, persistent NPI mismatch becomes a strong fraud signal.
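The high-mismatch screen described in this section (more than 50% of claims with a different servicing NPI, among billers with at least 500 claims) could be computed along these lines. The row layout `(billing_npi, servicing_npi, claims)` and the function name are assumptions made for illustration; the thresholds are the ones stated above.

```python
from collections import defaultdict


def high_mismatch_billers(rows, min_claims=500, threshold=0.5):
    """rows: (billing_npi, servicing_npi, claims) tuples.

    Returns {billing_npi: mismatch share} for billers with at least
    `min_claims` claims whose mismatch share exceeds `threshold`.
    """
    total = defaultdict(int)
    mismatched = defaultdict(int)
    for billing, servicing, claims in rows:
        total[billing] += claims
        if servicing != billing:
            mismatched[billing] += claims

    result = {}
    for npi, n in total.items():
        share = mismatched[npi] / n
        if n >= min_claims and share > threshold:
            result[npi] = share
    return result


# Hypothetical NPIs and claim counts.
sample = [
    ("A", "A", 100), ("A", "Z1", 300), ("A", "Z2", 200),  # 500 of 600 differ
    ("B", "B", 900), ("B", "Z3", 100),                    # 100 of 1,000 differ
    ("C", "Z4", 100),                                     # below claim floor
]
high_mismatch = high_mismatch_billers(sample)
print(high_mismatch)
```

As the context note says, a high share on its own mostly identifies group practices and billing companies; the screen is meant to be combined with the other indicators.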
Key Findings & Recommendations
Concentration risk
52.3%
of total spending is controlled by just 1% of providers. Auditing the top 7,395 providers could cover over half of all Medicaid expenditure.
Shell billing pattern
$12.4B
flows through high-mismatch billing entities: providers with ≥500 claims where >50% of claims list a servicing NPI different from the billing NPI.
J3490
$5.8B
billed under "unclassified drugs" — a catch-all code that bypasses specific drug identification and automated price checks.
Composite detection
$28.5B
in spending concentrated among ~4,900 providers flagged by the multi-dimensional fraud scoring system (top 1% by composite z-score).
Bottom line
Targeted auditing of fewer than 5,000 providers (less than 0.7% of total) could address up to $28.5 billion in potentially anomalous spending. The combination of extreme spending concentration, billing entity opacity, and procedure code ambiguity creates structural conditions that enable waste and fraud — regardless of intent.
Methodology Note
This analysis uses publicly available Medicaid provider spending data in Parquet format, queried with DuckDB. The dataset contains 127,932,773 records spanning 48 months (2019-01 to 2022-12), covering 739,498 unique billing providers and 7,702 HCPCS procedure codes.
Provider profiling
489,112 providers with ≥100 claims were profiled across: avg paid per claim, claims per beneficiary, % billing mismatch, and HCPCS code diversity.
Z-scores
Each dimension was standardized using z-scores (standard deviations from mean). Only positive z-scores (above-average anomaly) contribute to the composite.
Composite score
Average of clipped z-scores across all 4 dimensions. Higher = more anomalous across more dimensions simultaneously.
Peer comparison
Overpriced procedure detection compares each provider's avg cost per HCPCS code against the peer mean for that code (requiring ≥50 claims and ≥10 peer providers).
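The z-score and composite steps above can be sketched as follows. This is one reading of the description, not the original implementation: "clipped" is taken to mean clipping at zero, the population standard deviation is used, and code diversity is sign-flipped so that fewer codes yields a positive z-score (the source does not state how low diversity is turned into a positive anomaly). All dimension names and values are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values; return zeros when there is no variation."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


def composite_scores(profiles, dims, invert=()):
    """profiles: {provider_id: {dimension: value}}.

    Dimensions listed in `invert` are negated first, so that LOW raw
    values count as anomalous (assumed here for code diversity).
    """
    ids = list(profiles)
    clipped = []
    for dim in dims:
        sign = -1.0 if dim in invert else 1.0
        z = zscores([sign * profiles[p][dim] for p in ids])
        # Only above-average anomalies contribute to the composite.
        clipped.append([max(0.0, v) for v in z])
    return {p: mean(col[i] for col in clipped) for i, p in enumerate(ids)}


DIMS = ["avg_paid", "claims_per_bene", "mismatch_pct", "n_codes"]

# Hypothetical provider profiles.
profiles = {
    "A": {"avg_paid": 50, "claims_per_bene": 2.0, "mismatch_pct": 0.10, "n_codes": 40},
    "B": {"avg_paid": 60, "claims_per_bene": 3.0, "mismatch_pct": 0.20, "n_codes": 30},
    "C": {"avg_paid": 55, "claims_per_bene": 2.5, "mismatch_pct": 0.15, "n_codes": 35},
    "X": {"avg_paid": 500, "claims_per_bene": 40.0, "mismatch_pct": 0.98, "n_codes": 2},
}

scores = composite_scores(profiles, DIMS, invert=("n_codes",))
for provider, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(provider, round(score, 2))
```

Averaging clipped scores means a provider must be unusual on several dimensions to rank highly; a single extreme dimension is diluted by zeros on the others.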
Limitations
Statistical flags ≠ fraud
Anomalous patterns may have legitimate explanations (specialty providers, group billing structures, high-acuity patient populations).
No clinical context
The dataset lacks diagnosis codes, patient demographics, and clinical justification — all necessary for definitive fraud determination.
Aggregated data
Records are pre-aggregated by provider, code, and month — individual claim-level detail is not available.
This analysis is for informational and research purposes only. Statistical anomalies identified here should not be interpreted as evidence of fraud without further investigation. Data source: CMS Medicaid Provider Utilization and Payment Data.