Identifying High-Risk Patients without Labeled Training Data: Anomaly Detection Methodologies to Predict Adverse Outcomes

Zeeshan Syed; Mohammed Saeed; Ilan Rubinfeld

AMIA Annu Symp Proc. 2010 Nov 13;2010:772-6.

Authors

Zeeshan Syed¹, Mohammed Saeed, Ilan Rubinfeld

Affiliation

¹ University of Michigan, Ann Arbor, MI

PMID: 21347083
PMCID: PMC3041411

Abstract

For many clinical conditions, only a small number of patients experience adverse outcomes. Developing risk stratification algorithms for these conditions typically requires collecting large volumes of data to capture enough positive and negative examples for training. This process is slow and expensive, and may not be appropriate for new phenomena. In this paper, we explore different anomaly detection approaches to identify high-risk patients as cases that lie in sparse regions of the feature space. We study three broad categories of anomaly detection methods: classification-based, nearest neighbor-based, and clustering-based techniques. When evaluated on data from the National Surgical Quality Improvement Program (NSQIP), these methods successfully identified patients at an elevated risk of mortality and rare morbidities following inpatient surgical procedures.
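
As a rough illustration of the nearest neighbor-based category named in the abstract, the sketch below scores each case by its distance to its k-th nearest neighbor, so that cases in sparse regions of the feature space receive high scores without any outcome labels. This is not the authors' implementation: the function name `knn_anomaly_scores`, the choice of k, the Euclidean distance, and the synthetic two-feature data are all assumptions made for illustration, and no NSQIP data is used.

```python
import math
import random


def knn_anomaly_scores(points, k=3):
    """Return, for each point, the Euclidean distance to its k-th nearest neighbor.

    Larger scores mean the point lies in a sparser region of the feature space.
    Hypothetical helper for illustration only.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Synthetic, made-up feature vectors (assumption: two numeric features per patient).
random.seed(0)
typical = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]
unusual = [(10.0, 10.0)]  # one case placed far from the rest
patients = typical + unusual

scores = knn_anomaly_scores(patients, k=3)

# Index of the case with the highest anomaly score.
most_anomalous = max(range(len(patients)), key=scores.__getitem__)
print("Most anomalous case index:", most_anomalous)
```

In a setting like the one the abstract describes, cases would be ranked by such a score and the highest-scoring ones flagged as potentially high-risk; how the score is thresholded or combined with other methods is left open here.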

MeSH terms

Humans
Inpatients
Postoperative Complications*
Quality Improvement*
Risk
United States