Development and performance characteristics of novel code-based algorithms to identify invasive Escherichia coli disease

Pharmacoepidemiol Drug Saf. 2022 Sep;31(9):983-991. doi: 10.1002/pds.5505. Epub 2022 Jul 10.

Abstract

Purpose: To evaluate novel code-based algorithms for identifying invasive Escherichia coli disease (IED) among patients in healthcare databases.

Methods: Inpatient visits with microbiological evidence of invasive bacterial disease were extracted from the Optum© electronic health record database between January 1, 2016 and June 30, 2020. Six algorithms, derived from diagnosis and drug exposure codes associated with infectious diseases and Escherichia coli, were developed to identify IED. The performance characteristics of the algorithms were assessed using a reference standard derived from microbiology data.
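The performance characteristics named in this abstract (PPV, sensitivity, F1 score) follow from standard definitions once each algorithm's output is cross-tabulated against the reference standard. The sketch below is illustrative only: the function names and the true-positive, false-positive, and false-negative counts are assumptions, since the abstract does not report the underlying counts or the authors' code.

```python
def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: flagged records that are truly IED."""
    return tp / (tp + fp)


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity: true IED records that the algorithm flags."""
    return tp / (tp + fn)


def f1_score(ppv_value: float, sensitivity_value: float) -> float:
    """F1 score: harmonic mean of PPV and sensitivity."""
    return 2 * ppv_value * sensitivity_value / (ppv_value + sensitivity_value)


# Hypothetical counts, chosen only to exercise the functions.
tp, fp, fn = 80, 20, 20
print(ppv(tp, fp), sensitivity(tp, fn))
print(f1_score(ppv(tp, fp), sensitivity(tp, fn)))
```

Applied to the rounded values reported for Algorithm 4, `f1_score(0.936, 0.781)` gives roughly 0.85, in line with the reported F1 score.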

Results: Among 97 194 eligible records, 25 310 (26.0%) were classified as IED. Algorithm 1 (diagnosis code for infectious invasive disease due to E. coli) had the highest positive predictive value (PPV; 96.0%) and lowest sensitivity (60.4%). Algorithm 2, which additionally included patients with diagnosis codes for infectious invasive disease due to an unspecified organism, had the highest sensitivity (95.5%) and lowest PPV (27.8%). Algorithm 4, which required patients with a diagnosis code for infectious invasive disease due to an unspecified organism to have no diagnosis code for non-E. coli infections, achieved the most balanced performance characteristics (PPV, 93.6%; sensitivity, 78.1%; F1 score, 85.1%). Finally, adding exposure to antibiotics used in the treatment of E. coli had limited impact on performance (algorithms 5 and 6).
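One reading of Algorithms 1, 2, and 4 as described here can be written as boolean rules over per-visit flags. This is a minimal sketch under stated assumptions: the `Visit` fields are hypothetical stand-ins for the diagnosis code sets, which the abstract does not enumerate, and the treatment of visits carrying both an E. coli code and a non-E. coli code under Algorithm 4 is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    # Hypothetical flags; the actual diagnosis codes are not given in the abstract.
    ecoli_invasive_dx: bool        # invasive infectious disease due to E. coli
    unspecified_invasive_dx: bool  # invasive infectious disease, organism unspecified
    non_ecoli_dx: bool             # any diagnosis code for a non-E. coli infection


def algorithm_1(v: Visit) -> bool:
    """E. coli-specific invasive disease code only."""
    return v.ecoli_invasive_dx


def algorithm_2(v: Visit) -> bool:
    """Algorithm 1, plus invasive disease due to an unspecified organism."""
    return v.ecoli_invasive_dx or v.unspecified_invasive_dx


def algorithm_4(v: Visit) -> bool:
    """Unspecified-organism visits count only without a non-E. coli code."""
    # Assumption: an E. coli-specific code qualifies regardless of other codes.
    return v.ecoli_invasive_dx or (
        v.unspecified_invasive_dx and not v.non_ecoli_dx
    )


visit = Visit(ecoli_invasive_dx=False, unspecified_invasive_dx=True, non_ecoli_dx=True)
print(algorithm_2(visit), algorithm_4(visit))  # True False
```

The example visit shows the trade-off reported above: Algorithm 2 flags it, which raises sensitivity at the cost of PPV, while Algorithm 4 excludes it because a non-E. coli infection is coded.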

Conclusion: Algorithm 4, which achieved the most balanced performance characteristics, offers a useful tool to identify patients with IED and assess the burden of IED in healthcare databases.

Keywords: Escherichia coli; code-based algorithms; electronic health record database; performance characteristics; phenotype; sepsis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Databases, Factual
  • Electronic Health Records*
  • Escherichia coli
  • Humans
  • International Classification of Diseases
  • Predictive Value of Tests