A practical guide to multiple imputation of missing data in nephrology

Kidney Int. 2021 Jan;99(1):68-74. doi: 10.1016/j.kint.2020.07.035. Epub 2020 Aug 18.

Abstract

Health data are often plagued by missing values that can greatly reduce the sample size if only complete cases are considered for analysis. Furthermore, analyses that ignore missing data have the potential to introduce bias in the parameter estimates. Multiple imputation techniques have been developed to recover the information that would otherwise be lost when excluding observations with missing data and to help minimize bias. However, the validity of analyses using imputed data relies on the imputation model having been correctly specified. The aim of this guide is to aid the reader in the decision-making process when conducting an analysis with multiply imputed data in the context of nephrology research. We discuss (i) the assumed missing data mechanism, (ii) the imputation method, (iii) the imputation model, (iv) derived variables, (v) the number of imputed data sets, (vi) diagnostic checks, (vii) analysis and pooling of results, and (viii) reporting the results. This process is demonstrated using data from the National Health and Nutrition Examination Survey to explore the association between hypertension and kidney disease in adults from the general population. Example code is provided for SAS software and the mice package in R.
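To make step (vii) concrete, the pooling of results across imputed data sets is commonly done with Rubin's rules. The sketch below is a minimal illustration in Python, not the SAS or R code that accompanies the article. The function name `pool_rubin` and the input values are assumptions made for illustration only; the numbers are made up and are not results from the National Health and Nutrition Examination Survey analysis.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool estimates from m imputed data sets using Rubin's rules.

    estimates: one point estimate per imputed data set (e.g. a log odds ratio)
    variances: the squared standard error of each estimate
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates and one variance per estimate")

    q_bar = mean(estimates)         # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical estimates and standard errors from m = 5 imputed data sets
# (illustrative values only, not taken from the article).
est = [0.42, 0.47, 0.39, 0.45, 0.44]
se = [0.10, 0.11, 0.10, 0.12, 0.10]

pooled, pooled_se = pool_rubin(est, [s ** 2 for s in se])
print(f"pooled estimate = {pooled:.3f}, pooled SE = {pooled_se:.3f}")
```

The total variance adds the between-imputation component, inflated by a factor of 1 + 1/m, to the average within-imputation variance, so the pooled standard error reflects the extra uncertainty due to the missing values. Confidence intervals additionally require the degrees-of-freedom adjustment, which this sketch omits.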

Keywords: guide; missing data; multiple imputation.

Publication types

  • Review

MeSH terms

  • Adult
  • Animals
  • Bias
  • Data Interpretation, Statistical
  • Humans
  • Mice
  • Nephrology*
  • Nutrition Surveys
  • Sample Size