Observability and its impact on differential bias for clinical prediction models

J Am Med Inform Assoc. 2022 Apr 13;29(5):937-943. doi: 10.1093/jamia/ocac019.

Abstract

Objective: Electronic health records capture patient outcomes incompletely. We consider the case in which observability of the outcome differs across levels of a predictor. Including such a predictor (a sensitive variable) in a model can lead to algorithmic bias, potentially exacerbating health inequities.

Materials and methods: We define bias for a clinical prediction model (CPM) as the difference between the true and estimated risk, and differential bias as bias that differs across a sensitive variable. We illustrate the genesis of differential bias via a 2-stage process, where conditional on having the outcome of interest, the outcome is differentially observed. We use simulations and a real-data example to demonstrate the possible impact of including a sensitive variable in a CPM.
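One way to write down the definitions above is sketched here. The notation is an assumption made for illustration and is not taken from the article: Y is the true outcome, O indicates whether the outcome is observed, Y* is the recorded outcome, S is the sensitive variable, and X denotes the remaining predictors.

```latex
% Illustrative notation (assumed, not the article's own).
% Stage 1: true risk of the outcome.
p(x, s) = \Pr(Y = 1 \mid X = x, S = s)
% Stage 2: observability, conditional on having the outcome.
q(x, s) = \Pr(O = 1 \mid Y = 1, X = x, S = s)
% Recorded outcome and its probability.
Y^{*} = Y \cdot O, \qquad \Pr(Y^{*} = 1 \mid x, s) = p(x, s)\, q(x, s)
% Bias of an estimated risk \hat{p}: true minus estimated risk.
b(x, s) = p(x, s) - \hat{p}(x, s)
% Differential bias: the bias differs across the sensitive variable.
b(x, s) \neq b(x, s') \quad \text{for some } s \neq s'
```

Under this sketch, a model fit to Y* that recovers p(x, s) q(x, s) has bias p(x, s)(1 - q(x, s)), which varies with s whenever q does.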

Results: If there is differential observability based on a sensitive variable, including it in a CPM can induce differential bias. However, if the sensitive variable impacts the outcome but not observability, it is better to include it. When a sensitive variable impacts both observability and the outcome, no simple recommendation can be provided. We show that one cannot use observed data to detect differential bias.
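The first two results can be illustrated with a minimal simulation sketch. It is not the authors' simulation: the function name `simulate`, the risk and observability values, and the use of group-specific versus pooled observed event rates as stand-ins for a CPM with and without the sensitive variable are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(true_risk, p_observe, n=200_000, seed=0):
    """Two-stage sketch (hypothetical helper, illustrative values only).

    true_risk[g]  : true outcome risk in group g of the sensitive variable
    p_observe[g]  : probability the outcome is recorded, given it occurred
    Returns (bias_with, bias_without), each an array with one entry per
    group, where bias = true risk - estimated risk.
    """
    rng = np.random.default_rng(seed)
    true = np.asarray(true_risk, dtype=float)
    obs = np.asarray(p_observe, dtype=float)

    s = rng.integers(0, 2, size=n)        # binary sensitive variable
    y = rng.random(n) < true[s]           # stage 1: true outcome
    seen = rng.random(n) < obs[s]         # stage 2: outcome is captured
    y_obs = y & seen                      # what the record actually shows

    # Stand-in for a CPM that includes S: group-specific observed rates.
    est_with = np.array([y_obs[s == g].mean() for g in (0, 1)])
    # Stand-in for a CPM that excludes S: one pooled observed rate.
    est_without = np.full(2, y_obs.mean())

    return true - est_with, true - est_without

# Scenario A: S affects observability only.
bias_with_a, bias_without_a = simulate((0.3, 0.3), (0.9, 0.5))
diff_with_a = bias_with_a[1] - bias_with_a[0]
diff_without_a = bias_without_a[1] - bias_without_a[0]

# Scenario B: S affects the outcome only.
bias_with_b, bias_without_b = simulate((0.2, 0.4), (0.7, 0.7))
diff_with_b = bias_with_b[1] - bias_with_b[0]
diff_without_b = bias_without_b[1] - bias_without_b[0]

print("A: differential bias with S   :", round(diff_with_a, 3))
print("A: differential bias without S:", round(diff_without_a, 3))
print("B: differential bias with S   :", round(diff_with_b, 3))
print("B: differential bias without S:", round(diff_without_b, 3))
```

In scenario A the expected bias with S included is 0.3 × (1 − 0.9) = 0.03 in one group and 0.3 × (1 − 0.5) = 0.15 in the other, whereas the pooled estimate is equally biased in both groups. In scenario B, under these assumed values, excluding S leaves a larger gap in bias between groups than including it. Both models see only the recorded outcome, so neither can reveal the bias from observed data alone.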

Discussion: Our study furthers the literature on observability, showing that differential observability can lead to algorithmic bias. This highlights the importance of considering whether to include sensitive variables in CPMs.

Conclusion: Whether to include a sensitive variable in a CPM depends on whether it truly affects the outcome or just the observability of the outcome. Since the two cannot be distinguished with observed data, observability is an implicit assumption of CPMs.

Keywords: algorithmic bias; clinical prediction models; electronic health record; health equity; observability.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • Humans
  • Models, Statistical*
  • Prognosis