Comparison of Imputation Strategies for Incomplete Longitudinal Data in Life-Course Epidemiology

Am J Epidemiol. 2023 Nov 10;192(12):2075-2084. doi: 10.1093/aje/kwad139.

Abstract

Incomplete longitudinal data are common in life-course epidemiology and may induce bias leading to incorrect inference. Multiple imputation (MI) is increasingly preferred for handling missing data, but few studies have explored MI-method performance and feasibility in real-data settings. We compared 3 MI methods using real data under 9 missing-data scenarios, formed by crossing 3 levels of missingness (10%, 20%, and 30%) with 3 missingness mechanisms (missing completely at random, missing at random, and missing not at random). Using data from Health and Retirement Study (HRS) participants, we introduced record-level missingness into a sample of participants with complete data on depressive symptoms (1998-2008), mortality (2008-2018), and relevant covariates. We then imputed the missing data using the 3 MI methods (normal linear regression, predictive mean matching, and variable-tailored specification) and fitted Cox proportional hazards models to estimate the effects of 4 operationalizations of longitudinal depressive symptoms on mortality. We compared bias in hazard ratios, root mean square error, and computation time for each method. Bias was similar across MI methods, and results were consistent across operationalizations of the longitudinal exposure variable. However, our results suggest that predictive mean matching may be an appealing strategy for imputing life-course exposure data, given its consistently low root mean square error, competitive computation times, and few implementation challenges.
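The evaluation loop described in the abstract (delete values from complete data, impute, re-estimate, then summarize bias and root mean square error against the complete-data estimate) can be sketched in a few lines. The following is an illustration only and not the authors' code. It rests on several assumptions: the data are synthetic rather than HRS records, only the missing-completely-at-random mechanism is shown, the estimand is a simple mean rather than a Cox hazard ratio, and predictive mean matching is implemented in a simplified single-predictor form with a bootstrap refit. All function names and parameters (`pmm_impute`, `run_scenario`, `k`, `m`, `reps`) are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y, rng, k=5):
    """Return a copy of y with None entries filled by predictive mean matching."""
    obs = [i for i, v in enumerate(y) if v is not None]
    # Assumption: refit on a bootstrap resample of the observed rows so that
    # repeated calls give different imputations (a simplified PMM variant).
    boot = [rng.choice(obs) for _ in obs]
    a, b = fit_line([x[i] for i in boot], [y[i] for i in boot])
    filled = list(y)
    for i, v in enumerate(y):
        if v is None:
            target = a + b * x[i]
            # The k observed rows whose predicted values are closest to the target.
            donors = sorted(obs, key=lambda j: abs(a + b * x[j] - target))[:k]
            # Borrow the *observed* value of one randomly chosen donor.
            filled[i] = y[rng.choice(donors)]
    return filled


def bias_and_rmse(estimates, truth):
    """Mean error and root mean square error of estimates around truth."""
    errors = [e - truth for e in estimates]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


def run_scenario(prop_missing, n=200, m=5, reps=20, seed=0):
    """Bias and RMSE of the pooled mean of y under MCAR missingness."""
    rng = random.Random(seed)
    # Synthetic complete data (hypothetical; stands in for the real cohort).
    x = [rng.gauss(0, 1) for _ in range(n)]
    y_full = [2 + 1.5 * xi + rng.gauss(0, 1) for xi in x]
    truth = sum(y_full) / n  # complete-data estimate used as the reference
    estimates = []
    for _rep in range(reps):
        # MCAR: every value is deleted with the same probability.
        y_miss = [None if rng.random() < prop_missing else v for v in y_full]
        means = []
        for _draw in range(m):
            filled = pmm_impute(x, y_miss, rng)
            means.append(sum(filled) / n)
        # Pooled point estimate: average over the m imputed data sets.
        estimates.append(sum(means) / m)
    return bias_and_rmse(estimates, truth)


if __name__ == "__main__":
    for p in (0.10, 0.20, 0.30):
        bias, rmse = run_scenario(p)
        print(f"{p:.0%} MCAR: bias={bias:+.4f}, RMSE={rmse:.4f}")
```

Because predictive mean matching fills each gap with a value actually observed in a donor record, imputed values always stay within the range of the observed data, which is one plausible reason the method poses few implementation challenges for bounded or skewed scores such as symptom counts.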

Keywords: Health and Retirement Study; fully conditional specification; joint modeling; longitudinal data; missing not at random; multiple imputation; multiple imputation by chained equations; predictive mean matching.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • Computer Simulation
  • Data Interpretation, Statistical
  • Humans
  • Linear Models
  • Proportional Hazards Models
  • Research Design*