A Warning About Using Predicted Values From Regression Models for Epidemiologic Inquiry

Am J Epidemiol. 2021 Jun 1;190(6):1142-1147. doi: 10.1093/aje/kwaa282.

Abstract

In many settings, researchers may not have direct access to data on 1 or more variables needed for an analysis and instead may use regression-based estimates of those variables. Using such estimates in place of original data, however, introduces complications and can result in uninterpretable analyses. In simulations and observational data, we illustrate the issues that arise when an average treatment effect is estimated from data where the outcome of interest is predicted from an auxiliary model. We show that bias in any direction can result, under both the null and alternative hypotheses.
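The mechanism described above can be illustrated with a minimal simulation sketch. This is not the authors' simulation: the data-generating process, the function name `simulate`, and the parameters `beta` (true treatment effect on the outcome) and `gamma` (treatment effect on a proxy used by the auxiliary model) are all hypothetical choices made for illustration. The auxiliary model is fit in an external, untreated sample and never sees treatment status, which is one plausible way the problem could arise.

```python
import numpy as np


def simulate(beta, gamma, n=200_000, seed=0):
    """Compare a treatment contrast on the true outcome with the same
    contrast on an outcome predicted by an auxiliary model.

    Assumed (hypothetical) data-generating process:
      Y = beta * A + X + noise   true outcome
      Z = X + gamma * A + noise  proxy available to the auxiliary model
    """
    rng = np.random.default_rng(seed)

    # External, untreated sample used only to fit the auxiliary model Y ~ Z.
    x_ext = rng.normal(size=n)
    y_ext = x_ext + rng.normal(size=n)
    z_ext = x_ext + rng.normal(size=n)
    slope, intercept = np.polyfit(z_ext, y_ext, 1)  # highest degree first

    # Analysis sample with a randomized binary treatment A.
    a = rng.integers(0, 2, size=n)
    x = rng.normal(size=n)
    y = beta * a + x + rng.normal(size=n)
    z = x + gamma * a + rng.normal(size=n)

    # The analyst does not observe Y and substitutes the prediction.
    y_hat = intercept + slope * z

    # Difference in means between treated and untreated units.
    effect_true_outcome = y[a == 1].mean() - y[a == 0].mean()
    effect_predicted_outcome = y_hat[a == 1].mean() - y_hat[a == 0].mean()
    return effect_true_outcome, effect_predicted_outcome


if __name__ == "__main__":
    # Null is true, but treatment shifts the proxy: spurious nonzero effect.
    print("null:       ", simulate(beta=0.0, gamma=1.0))
    # Effect is real, but the proxy carries no treatment signal: effect lost.
    print("alternative:", simulate(beta=1.0, gamma=0.0))
```

In this setup the contrast on the predicted outcome is approximately the auxiliary slope times `gamma`, regardless of `beta`. It therefore departs from zero when the null is true and `gamma` is nonzero, and collapses toward zero when the true effect is nonzero and `gamma` is zero.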

Keywords: imputation; measurement error; proxy variables.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bias
  • Data Interpretation, Statistical*
  • Epidemiologic Studies*
  • Forecasting
  • Humans
  • Models, Statistical*
  • Regression Analysis*