Validation of a Machine Learning Model That Outperforms Clinical Risk Scoring Systems for Upper Gastrointestinal Bleeding

Dennis L Shung; Benjamin Au; Richard Andrew Taylor; J Kenneth Tay; Stig B Laursen; Adrian J Stanley; Harry R Dalton; Jeffrey Ngu; Michael Schultz; Loren Laine

doi:10.1053/j.gastro.2019.09.009

Gastroenterology. 2020 Jan;158(1):160-167. doi: 10.1053/j.gastro.2019.09.009. Epub 2019 Sep 25.

Affiliations

¹ Yale School of Medicine, New Haven, Connecticut. Electronic address: dennis.shung@yale.edu.
² Yale School of Medicine, New Haven, Connecticut.
³ Stanford University, Palo Alto, California.
⁴ Odense University Hospital, Odense, Denmark.
⁵ Glasgow Royal Infirmary, Glasgow, United Kingdom.
⁶ Royal Cornwall Hospital, Cornwall, United Kingdom.
⁷ Christchurch Hospital, Christchurch, New Zealand.
⁸ Dunedin Hospital, Dunedin, New Zealand.
⁹ Yale School of Medicine, New Haven, Connecticut; Veterans Affairs Connecticut Healthcare System, West Haven, Connecticut. Electronic address: loren.laine@yale.edu.

Abstract

Background & aims: Scoring systems are suboptimal for determining risk in patients with upper gastrointestinal bleeding (UGIB); these might be improved by a machine learning model. We used machine learning to develop a model to calculate the risk of hospital-based intervention or death in patients with UGIB and compared its performance with other scoring systems.

Methods: We analyzed data collected from consecutive unselected patients with UGIB from medical centers in 4 countries (the United States, Scotland, England, and Denmark; n = 1958) from March 2014 through March 2015. We used the data to derive and internally validate a gradient-boosting machine learning model to identify patients who met a composite endpoint of hospital-based intervention (transfusion or hemostatic intervention) or death within 30 days. We compared the performance of the machine learning prediction model with validated pre-endoscopic clinical risk scoring systems (the Glasgow-Blatchford score [GBS], admission Rockall score, and AIMS65). We externally validated the machine learning model using data from 2 Asia-Pacific sites (Singapore and New Zealand; n = 399). Performance was measured by area under receiver operating characteristic curve (AUC) analysis.
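As an illustration of the AUC metric named in the Methods, the following minimal sketch computes it as the probability that a randomly chosen patient who met the endpoint receives a higher risk score than a randomly chosen patient who did not (ties count as one half). The function name `auc` and the toy scores and labels are assumptions made for this example; they are not the study's data or code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk values (higher means higher risk).
    labels: 1 if the patient met the endpoint, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(round(auc(toy_scores, toy_labels), 3))
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; a rank-based computation gives the same value more efficiently on large cohorts.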

Results: The machine learning model identified patients who met the composite endpoint with an AUC of 0.91 in the internal validation set; the clinical scoring systems identified patients who met the composite endpoint with AUC values of 0.88 for the GBS (P = .001), 0.73 for Rockall score (P < .001), and 0.78 for AIMS65 score (P < .001). In the external validation cohort, the machine learning model identified patients who met the composite endpoint with an AUC of 0.90, the GBS with an AUC of 0.87 (P = .004), the Rockall score with an AUC of 0.66 (P < .001), and the AIMS65 with an AUC of 0.64 (P < .001). At cutoff scores at which the machine learning model and GBS identified patients who met the composite endpoint with 100% sensitivity, the specificity values were 26% with the machine learning model versus 12% with GBS (P < .001).
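The comparison at 100% sensitivity can be sketched as follows: pick the highest cutoff that still flags every patient who met the endpoint, then measure the fraction of patients who did not meet it that fall below that cutoff. The function name `specificity_at_full_sensitivity`, the rule "flag as high risk when score >= cutoff", and the toy values are assumptions for this example, not the study's procedure or data.

```python
def specificity_at_full_sensitivity(scores, labels):
    """Return (cutoff, specificity) at the highest cutoff with 100% sensitivity.

    A patient is flagged as high risk when score >= cutoff.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    # The lowest score among positives is the highest cutoff that misses none.
    cutoff = min(pos)
    # True negatives: patients without the endpoint who fall below the cutoff.
    true_negatives = sum(1 for s in neg if s < cutoff)
    return cutoff, true_negatives / len(neg)


# Hypothetical toy data, for illustration only.
toy_scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 0]
cutoff, spec = specificity_at_full_sensitivity(toy_scores, toy_labels)
print(cutoff, spec)
```

In this toy case, two of the four patients without the endpoint score below the cutoff, so they are the ones a rule of this kind would classify as low risk.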

Conclusions: We developed a machine learning model that identifies patients with UGIB who meet a composite endpoint of hospital-based intervention or death within 30 days with a greater AUC and, at 100% sensitivity, higher specificity than validated clinical risk scoring systems. This model could increase identification of low-risk patients who can be safely discharged from the emergency department for outpatient management.

Keywords: Artificial Intelligence; Mortality; Prediction; Prognostic Factor.

Publication types

Research Support, N.I.H., Extramural
Validation Study

MeSH terms

Adult
Aged
Aged, 80 and over
Blood Transfusion / statistics & numerical data
Emergency Service, Hospital / statistics & numerical data
Female
Gastrointestinal Hemorrhage / diagnosis*
Gastrointestinal Hemorrhage / therapy
Hemostatic Techniques / statistics & numerical data
Humans
Machine Learning*
Male
Middle Aged
Models, Biological*
Prognosis
ROC Curve
Risk Assessment / methods
