Propensity scores using missingness pattern information: a practical guide

Stat Med. 2020 May 20;39(11):1641-1657. doi: 10.1002/sim.8503. Epub 2020 Feb 27.

Abstract

Electronic health records are a valuable data source for investigating health-related questions, and propensity score analysis has become an increasingly popular approach to address confounding bias in such investigations. However, because electronic health records are routinely recorded as part of standard clinical care, there are often missing values, particularly for potential confounders. In our motivating study, which uses electronic health records to investigate the effect of renin-angiotensin system blockers on the risk of acute kidney injury, two key confounders, ethnicity and chronic kidney disease stage, have 59% and 53% missing data, respectively. The missingness pattern approach (MPA), a variant of the missing indicator approach, has been proposed as a method for handling partially observed confounders in propensity score analysis. In the MPA, propensity scores are estimated separately for each missingness pattern present in the data. Although the assumptions underlying the validity of the MPA are stated in the literature, it can be difficult in practice to assess their plausibility. In this article, we explore the MPA's underlying assumptions by using causal diagrams to assess their plausibility in a range of simple scenarios, drawing general conclusions about situations in which they are likely to be violated. We present a framework providing practical guidance for assessing whether the MPA's assumptions are plausible in a particular setting and thus deciding when the MPA is appropriate. We apply our framework to our motivating study, showing that the MPA's underlying assumptions appear reasonable, and we demonstrate the application of the MPA to this study.
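
The core mechanical step of the MPA, estimating propensity scores separately within each missingness pattern, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that a logistic propensity model is fitted within each pattern using only the confounders observed in that pattern, that missing values are coded as `None`, and that a hand-rolled gradient-ascent fit is acceptable for illustration. The function names (`fit_logistic`, `mpa_propensity_scores`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, t, steps=2000, lr=0.1):
    """Fit a logistic model by plain gradient ascent on the mean log-likelihood.

    Illustrative only; returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for row, ti in zip(X, t):
            linear = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            resid = ti - sigmoid(linear)
            grad[0] += resid
            for j, x in enumerate(row):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def mpa_propensity_scores(covariates, treatment):
    """Estimate propensity scores separately for each missingness pattern.

    covariates: list of rows; None marks a missing confounder value.
    treatment:  list of 0/1 treatment indicators.
    """
    # Group row indices by which confounders are missing.
    groups = defaultdict(list)
    for i, row in enumerate(covariates):
        pattern = tuple(v is None for v in row)
        groups[pattern].append(i)

    scores = [None] * len(covariates)
    for pattern, idx in groups.items():
        # Within a pattern, use only the confounders observed in that pattern.
        observed = [j for j, missing in enumerate(pattern) if not missing]
        X = [[covariates[i][j] for j in observed] for i in idx]
        t = [treatment[i] for i in idx]
        beta = fit_logistic(X, t)
        for i, row in zip(idx, X):
            linear = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            scores[i] = sigmoid(linear)
    return scores


# Toy data (made up): two confounders, three missingness patterns.
covariates = [
    [1, 0.2], [0, 0.8], [1, 0.5], [0, 0.1],     # both observed
    [None, 0.3], [None, 0.9], [None, 0.6],      # first confounder missing
    [None, None], [None, None], [None, None],   # both missing
]
treatment = [1, 0, 1, 0, 0, 1, 1, 1, 1, 0]

scores = mpa_propensity_scores(covariates, treatment)
```

In the pattern where both confounders are missing, the model reduces to an intercept, so the estimated score is simply the proportion treated within that pattern. With groups this small the fitted models are not meaningful; in a real analysis each pattern would need enough treated and untreated individuals, and an established statistics package would normally be used for the fitting.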

Keywords: electronic health records; missing confounder data; missing indicator; missingness pattern; propensity score analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias
  • Causality
  • Models, Statistical*
  • Propensity Score
  • Research Design*