Machine learning for phenotyping opioid overdose events

Jonathan Badger; Eric LaRose; John Mayer; Fereshteh Bashiri; David Page; Peggy Peissig

doi:10.1016/j.jbi.2019.103185

J Biomed Inform. 2019 Jun:94:103185. doi: 10.1016/j.jbi.2019.103185. Epub 2019 Apr 25.

Authors

Jonathan Badger¹, Eric LaRose², John Mayer², Fereshteh Bashiri², David Page³, Peggy Peissig²

Affiliations

¹ Marshfield Clinic Research Institute, Marshfield, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA. Electronic address: badger.jonathan@marshfieldresearch.org.
² Marshfield Clinic Research Institute, Marshfield, WI, USA.
³ Department of Computer Sciences, University of Wisconsin, Madison, WI, USA; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA.

Abstract

Objective: To develop machine learning models for classifying the severity of opioid overdose events from clinical data.

Materials and methods: Opioid overdoses were identified by diagnosis codes in the Marshfield Clinic population and assigned a severity score via chart review to form a gold standard set of labels. Three primary feature sets were constructed from disparate data sources surrounding each event and used to train machine learning models for phenotyping.

Results: Random forest and penalized logistic regression models performed best, with cross-validated mean areas under the ROC curve (AUCs) across all severity classes of 0.893 and 0.882, respectively. Features derived from a common data model outperformed features collected from disparate data sources for the same cohort of patients (AUC 0.893 versus 0.837, p value = 0.002). Adding features extracted from free text to the machine learning models also increased the AUC from 0.827 to 0.893 (p value < 0.0001). Keyword features extracted using natural language processing (NLP), such as 'Narcan' and 'Endotracheal Tube', are important for classifying overdose event severity.
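
To make the headline metric concrete, the following is a minimal sketch of how a mean AUC across severity classes could be computed from one-vs-rest scores. It is an illustration only, not the authors' evaluation code: the three severity classes, the toy scores, the unweighted averaging, and the function names `binary_auc` and `mean_ovr_auc` are all assumptions, and the cross-validation step is omitted.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_ovr_auc(y_true, class_scores):
    """Unweighted mean of one-vs-rest AUCs (averaging scheme is an assumption).

    class_scores maps each class label to a list of per-event scores.
    """
    per_class = {}
    for cls, scores in class_scores.items():
        labels = [1 if y == cls else 0 for y in y_true]
        per_class[cls] = binary_auc(labels, scores)
    return sum(per_class.values()) / len(per_class), per_class


# Hypothetical toy data: six events, three made-up severity classes (0, 1, 2).
y_true = [0, 0, 1, 1, 2, 2]
class_scores = {
    0: [0.8, 0.6, 0.3, 0.1, 0.2, 0.1],
    1: [0.1, 0.3, 0.5, 0.2, 0.3, 0.4],
    2: [0.1, 0.1, 0.2, 0.7, 0.5, 0.5],
}

mean_auc, per_class = mean_ovr_auc(y_true, class_scores)
print(per_class)
print(round(mean_auc, 3))
```

In a cross-validated setting, a value like this would be computed on each held-out fold and then averaged over folds; the toy numbers here bear no relation to the reported 0.893 and 0.882.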

Conclusion: Random forest models using features derived from a common data model and free text can be effective for classifying opioid overdose events.

Keywords: Electronic health record; Machine learning; Opioid; Overdose; Phenotype.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Analgesics, Opioid / administration & dosage*
Drug Overdose*
Electronic Health Records
Humans
Machine Learning*
Phenotype*
Severity of Illness Index

Substances

Analgesics, Opioid
