Approaches to addressing missing values, measurement error, and confounding in epidemiologic studies

J Clin Epidemiol. 2021 Mar:131:89-100. doi: 10.1016/j.jclinepi.2020.11.006. Epub 2020 Nov 8.

Abstract

Objectives: Epidemiologic studies often suffer from incomplete data, measurement error (or misclassification), and confounding. Each of these can cause bias and imprecision in estimates of exposure-outcome relations. We describe and compare statistical approaches that aim to control all three sources of bias simultaneously.

Study design and setting: We illustrate four statistical approaches that address all three sources of bias, namely, multiple imputation for missing data and measurement error, multiple imputation combined with regression calibration, full information maximum likelihood within a structural equation modeling framework, and a Bayesian model. In a simulation study, we assess the performance of the four approaches compared with more commonly used approaches that do not account for measurement error, missing values, or confounding.
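To make one ingredient of these approaches concrete, the following is a minimal sketch of regression calibration alone, not the authors' combined procedure. Everything in it is an illustrative assumption that does not come from the study: the data-generating model, the effect sizes, the classical additive error, the validation subset in which the true exposure is observed, and the names `ols`, `naive_beta`, and `rc_beta`. Missing values are not handled here.

```python
import numpy as np

def ols(design, y):
    """Least-squares coefficients for y regressed on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 5000
beta_true = 1.0  # assumed exposure effect, chosen for illustration only

z = rng.normal(size=n)                 # confounder
x = 0.5 * z + rng.normal(size=n)       # true exposure, depends on the confounder
w = x + rng.normal(size=n)             # error-prone exposure (classical error)
y = beta_true * x + 0.5 * z + rng.normal(size=n)

ones = np.ones(n)
design_wz = np.column_stack([ones, w, z])

# Naive analysis: adjust for the confounder but ignore measurement error.
naive_beta = ols(design_wz, y)[1]

# Regression calibration: in an assumed validation subset where x is observed,
# model x given (w, z), then substitute the prediction for x in everyone.
val = np.arange(n) < 1000
calibration = ols(design_wz[val], x[val])
x_hat = design_wz @ calibration
rc_beta = ols(np.column_stack([ones, x_hat, z]), y)[1]

print(f"naive: {naive_beta:.2f}, regression calibration: {rc_beta:.2f}")
```

With equal variances for the error and for the exposure given the confounder, as assumed here, the naive slope is attenuated toward about half the true value, while the calibrated slope should land near it. Standard errors from the second-stage fit ignore the uncertainty in the calibration model, so a real analysis would need a bootstrap or a similar correction.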

Results: The four approaches consistently outperform the alternative approaches on all performance metrics (bias, mean squared error, and confidence interval coverage). They perform well even in simulated data sets of only 100 subjects.
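The three performance metrics named above can be computed from repeated simulation runs roughly as follows. This is a sketch under stated assumptions: the function name `performance_metrics`, the Wald-type interval with a normal critical value of 1.96, and the toy inputs are hypothetical and are not the study's code or results.

```python
import numpy as np

def performance_metrics(estimates, std_errors, true_value, z=1.96):
    """Summarize simulation replicates of a single parameter estimate.

    bias:     mean estimate minus the true value
    mse:      mean squared deviation from the true value
    coverage: share of Wald intervals (estimate +/- z * SE) containing the truth
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - true_value
    mse = np.mean((estimates - true_value) ** 2)
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    coverage = np.mean((lower <= true_value) & (true_value <= upper))
    return {"bias": bias, "mse": mse, "coverage": coverage}

# Toy replicates, made up purely to show the call.
metrics = performance_metrics(
    estimates=[0.9, 1.1, 1.0, 1.4],
    std_errors=[0.1, 0.1, 0.1, 0.1],
    true_value=1.0,
)
print(metrics)
```

Mean squared error combines squared bias and variance, which is why an approach can look acceptable on bias alone and still perform poorly overall.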

Conclusion: Addressing measurement error, missing values, and confounding can substantially improve the estimation of exposure-outcome relations, even when the available sample size is relatively small.

Keywords: Confounding; Data analysis; Imputation; Measurement error; Missing data; Regression; Regression calibration; Simulation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Bias
  • Computer Simulation
  • Confounding Factors, Epidemiologic
  • Data Interpretation, Statistical*
  • Epidemiologic Studies*
  • Humans
  • Probability