[Multiple imputation and complete case analysis in logistic regression models: a practical assessment of the impact of incomplete covariate data]

Cad Saude Publica. 2011 Dec;27(12):2299-313. doi: 10.1590/s0102-311x2011001200003.
[Article in Portuguese]

Abstract

Health researchers often have to work with incomplete databases. Complete case analysis (CCA), which restricts the analysis to subjects with complete data, reduces the sample size and may yield biased estimates. Multiple imputation (MI), which is statistically principled and uses all collected data, is recommended as an alternative to CCA. Data from the Saúde em Beagá study, which included 4,048 adults from two of the nine health districts of Belo Horizonte, Minas Gerais State, Brazil, in 2008-2009, were used to evaluate CCA and different MI approaches in logistic models with incomplete covariate data. Peculiarities of some variables in this study made it possible to analyze a situation in which the missing covariate data were later recovered, so that results before and after recovery could be compared. In this analysis, even the simplest MI approach performed better than CCA, in that its estimates were closer to the post-recovery results.
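The comparison described in the abstract can be sketched on synthetic data. This is a minimal illustration and does not reproduce the authors' analysis or the Saúde em Beagá data. The simulated covariate, the missingness mechanism, the deliberately simplistic imputation model (a normal draw within each outcome group) and the helper names `fit_logistic`, `impute_once` and `pool` are all assumptions made for this example. The fit on the fully observed covariate stands in for the post-recovery benchmark.

```python
import math
import random
import statistics

def fit_logistic(x, y, iters=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x. Returns (b1, var(b1))."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b1, h00 / det              # slope and its inverse-information variance

def impute_once(x, y, rng):
    """Simplistic stochastic imputation (an assumption of this sketch):
    draw each missing x from a normal fitted to observed x in the same outcome group."""
    completed = list(x)
    for group in (0, 1):
        obs = [xi for xi, yi in zip(x, y) if yi == group and xi is not None]
        mu, sd = statistics.mean(obs), statistics.stdev(obs)
        for i, (xi, yi) in enumerate(zip(x, y)):
            if yi == group and xi is None:
                completed[i] = rng.gauss(mu, sd)
    return completed

def pool(estimates, variances):
    """Rubin's rules: pooled estimate and total variance over m imputations."""
    m = len(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    return statistics.mean(estimates), within + (1 + 1 / m) * between

# Synthetic data (not the study data): one covariate, true slope 1.0.
rng = random.Random(2011)
n = 2000
x_full = [rng.gauss(0, 1) for _ in range(n)]
y = [int(rng.random() < 1 / (1 + math.exp(0.5 - xi))) for xi in x_full]
# Assumed missingness mechanism: x is missing more often when y == 1.
x_obs = [None if rng.random() < (0.5 if yi else 0.2) else xi
         for xi, yi in zip(x_full, y)]

# Complete case analysis: drop subjects with missing x.
cc = [(xi, yi) for xi, yi in zip(x_obs, y) if xi is not None]
cca_b1, cca_var = fit_logistic([c[0] for c in cc], [c[1] for c in cc])

# Multiple imputation: 20 completed data sets, pooled with Rubin's rules.
fits = [fit_logistic(impute_once(x_obs, y, rng), y) for _ in range(20)]
mi_b1, mi_var = pool([f[0] for f in fits], [f[1] for f in fits])

# Benchmark: the covariate as if fully recovered.
full_b1, full_var = fit_logistic(x_full, y)

print(f"complete cases: {len(cc)} of {n}")
for name, b, v in [("recovered", full_b1, full_var),
                   ("CCA", cca_b1, cca_var), ("MI", mi_b1, mi_var)]:
    print(f"{name:>9}: slope = {b:.3f}, SE = {math.sqrt(v):.3f}")
```

Which method lands closer to the benchmark depends on the missingness mechanism and the imputation model chosen here, so the printed numbers illustrate the workflow only and say nothing about the study's findings.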

Publication types

  • English Abstract

MeSH terms

  • Adolescent
  • Adult
  • Body Mass Index*
  • Brazil
  • Data Interpretation, Statistical*
  • Databases, Factual
  • Female
  • Health Surveys / methods*
  • Humans
  • Logistic Models
  • Male
  • Middle Aged
  • Regression Analysis*
  • Sex Factors
  • Young Adult