Residual Confounding in Health Plan Performance Assessments: Evidence From Randomization in Medicaid

Jacob Wallace; J Michael McWilliams; Anthony Lollo; Janet Eaton; Chima D Ndumele

doi:10.7326/M21-0881

Ann Intern Med. 2022 Mar;175(3):314-324. doi: 10.7326/M21-0881. Epub 2022 Jan 4.

Authors

Jacob Wallace¹, J Michael McWilliams², Anthony Lollo¹, Janet Eaton³, Chima D Ndumele¹

Affiliations

¹ Yale School of Public Health, New Haven, Connecticut (J.W., A.L., C.D.N.).
² Department of Health Care Policy, Harvard Medical School, and Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, Massachusetts (J.M.M.).
³ Yale School of Public Health, and Tobin Center for Economic Policy, Yale University, New Haven, Connecticut (J.E.).

PMID: 34978862
DOI: 10.7326/M21-0881

Abstract

Background: Risk adjustment is used widely in payment systems and performance assessments, but the extent to which it distinguishes plan or provider effects from confounding due to patient differences is typically unknown.

Objective: To assess the degree to which risk-adjusted measures of health plan performance adequately adjust for the variation across plans that arises because of differences in patient characteristics (residual confounding).

Design: Comparison between plan performance estimates based on enrollees who made plan choices (observational population) and estimates based on enrollees assigned to plans (randomized population).

Setting: Natural experiment in which more than two thirds of a state's Medicaid population in 1 region was randomly assigned to 1 of 5 plans.

Participants: 137 933 enrollees in 2013 to 2014, of whom 31.1% selected a plan and 68.9% were randomly assigned to 1 of the same 5 plans.

Measurements: Annual total spending (that is, payments to providers), primary care use, dental care use, and avoidable emergency department visits, all scored as plan-specific deviations from the "average" plan performance within each population.
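Illustration: the scoring described under Measurements can be sketched as follows. This is a minimal sketch, not the authors' code. It assumes the "average" plan performance is the unweighted mean of plan-level means (the abstract does not say whether enrollment weights were used), and the function name `plan_scores` and all dollar figures are hypothetical.

```python
from statistics import mean

def plan_scores(plan_means):
    """Score each plan as its deviation from the average across plans.

    Assumption: the average is the unweighted mean of the plan-level means.
    """
    average = mean(plan_means.values())
    return {plan: value - average for plan, value in plan_means.items()}

# Hypothetical annual spending per enrollee by plan (made-up, not study data).
observational_means = {"A": 1700.0, "B": 1500.0, "C": 1600.0, "D": 1450.0, "E": 1750.0}
scores = plan_scores(observational_means)
```

Under this definition the scores within a population sum to zero, so each score is read relative to the other plans in that population rather than as an absolute level.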

Results: Enrollee characteristics were appreciably imbalanced across plans in the observational population, as expected, but were not in the randomized population. Annual total spending varied across plans more in the observational population (SD, $147 per enrollee) than in the randomized population (SD, $70 per enrollee) after accounting for baseline differences in the observational and randomized populations and for differences across plans. On average, a plan's spending score (its deviation from the "average" performance) in the observational population differed from its score in the randomized population by $67 per enrollee in absolute value (95% CI, $38 to $123), or 4.2% of mean spending per enrollee (P = 0.009, rejecting the null hypothesis that this difference would be expected from sampling error). The difference was reduced modestly by risk adjustment to $62 per enrollee (P = 0.012). Residual confounding was similarly substantial for most other performance measures. Further adjustment for social factors did not materially change estimates.
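Illustration: the two summary quantities reported in the Results, the spread of plan scores within a population and the average absolute gap between a plan's observational and randomized scores, can be sketched as follows. This is a hedged sketch rather than the study's method: the scores are hypothetical, the function name `mean_abs_difference` is invented for illustration, the population SD is an assumption (the abstract does not specify the estimator), and the reported CI and P value are not reproduced.

```python
from statistics import mean, pstdev

def mean_abs_difference(obs_scores, rand_scores):
    """Average absolute gap between each plan's two scores."""
    return mean(abs(obs_scores[plan] - rand_scores[plan]) for plan in obs_scores)

# Hypothetical plan scores in dollars per enrollee (made-up, not study data).
obs_scores = {"A": 100.0, "B": -100.0, "C": 0.0, "D": -150.0, "E": 150.0}
rand_scores = {"A": 40.0, "B": -60.0, "C": 20.0, "D": -50.0, "E": 50.0}

# Spread of scores across plans in each population (population SD assumed).
sd_observational = pstdev(obs_scores.values())
sd_randomized = pstdev(rand_scores.values())

# Average absolute discrepancy between observational and randomized scores.
gap = mean_abs_difference(obs_scores, rand_scores)
```

In this toy example the observational scores are more dispersed than the randomized ones, mirroring the direction of the reported result. A large gap relative to what sampling error alone would produce is what the abstract interprets as confounding.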

Limitation: Potential heterogeneity in plan effects between the 2 populations.

Conclusion: Residual confounding in risk-adjusted performance assessments can be substantial and should caution policymakers against assuming that risk adjustment isolates real differences in plan performance.

Primary funding source: Arnold Ventures.

Publication types

Randomized Controlled Trial
Research Support, Non-U.S. Gov't

MeSH terms

Humans
Medicaid*
Random Allocation
United States