Handling missing covariate data in clinical studies in haematology

Best Pract Res Clin Haematol. 2023 Jun;36(2):101477. doi: 10.1016/j.beha.2023.101477. Epub 2023 May 22.

Abstract

Missing data are frequently encountered across studies in clinical haematology. Failure to handle these missing values in an appropriate manner can complicate the interpretation of a study's findings, as estimates presented may be biased and/or imprecise. In the present work, we first provide an overview of current methods for handling missing covariate data, along with their advantages and disadvantages. Furthermore, a systematic review is presented, exploring both contemporary reporting of missing values in major haematological journals, and the methods used for handling them. A principal finding was that the method of handling missing data was explicitly specified in a minority of articles (in 76 out of 195 articles reporting missing values, 39%). Among these, complete case analysis and the missing indicator method were the most common approaches to dealing with missing values, with more complex methods such as multiple imputation being extremely rare (in 7 out of 195 articles). An example analysis (with associated code) is also provided using haematopoietic stem cell transplantation data, illustrating the different approaches to handling missing values. We conclude with various recommendations regarding the reporting and handling of missing values for future studies in clinical haematology.
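
As a rough illustration of the two approaches the review found most common, the sketch below applies complete case analysis and the missing indicator method to a small invented dataset. It is not the code associated with the article: the records, the covariate names (`age`, `donor_type`), the helper functions and the fill value of 0 are all assumptions made for the example. Multiple imputation is not shown, as it normally relies on a dedicated statistical package.

```python
# Toy records; None marks a missing covariate value (invented data, not from the article).
records = [
    {"age": 54, "donor_type": "sibling"},
    {"age": None, "donor_type": "unrelated"},
    {"age": 61, "donor_type": None},
    {"age": 47, "donor_type": "unrelated"},
]


def complete_cases(rows):
    """Complete case analysis: keep only rows with every covariate observed."""
    return [r for r in rows if all(v is not None for v in r.values())]


def missing_indicator(rows, numeric_keys, fill=0):
    """Missing indicator method: keep every row and flag what was missing.

    Each covariate gains a 0/1 "<name>_missing" column. Missing numeric
    values are replaced by a constant (assumed here to be 0) and missing
    categorical values by an extra "missing" level.
    """
    out = []
    for r in rows:
        new = {}
        for key, value in r.items():
            new[f"{key}_missing"] = int(value is None)
            if value is not None:
                new[key] = value
            elif key in numeric_keys:
                new[key] = fill
            else:
                new[key] = "missing"
        out.append(new)
    return out


cc = complete_cases(records)
mi = missing_indicator(records, numeric_keys={"age"})
print(len(records), len(cc), len(mi))  # prints: 4 2 4
```

In this toy example, complete case analysis discards half of the patients, whereas the missing indicator method retains all four but relies on an arbitrary fill value.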

Keywords: Complete case analysis; Missing covariates; Missing data; Missing indicator method; Multiple imputation.

Publication types

  • Systematic Review
  • Review

MeSH terms

  • Data Interpretation, Statistical
  • Hematology*
  • Humans
  • Research Design