Using natural language processing of clinical text to enhance identification of opioid-related overdoses in electronic health records data

Brian Hazlehurst; Carla A Green; Nancy A Perrin; John Brandes; David S Carrell; Andrew Baer; Angela DeVeaugh-Geiss; Paul M Coplan

doi:10.1002/pds.4810

Pharmacoepidemiol Drug Saf. 2019 Aug;28(8):1143-1151. doi: 10.1002/pds.4810. Epub 2019 Jun 19.

Authors

Brian Hazlehurst¹, Carla A Green¹, Nancy A Perrin¹, John Brandes¹, David S Carrell², Andrew Baer³, Angela DeVeaugh-Geiss⁴, Paul M Coplan⁴ ⁵

Affiliations

¹ Center for Health Research, Kaiser Permanente Northwest, Portland, OR.
² Health Research Institute, Kaiser Permanente Washington, Seattle, WA.
³ Group Health Research Institute, Group Health Cooperative, Seattle, WA.
⁴ Epidemiology, Medical Affairs, Purdue Pharma, LP, Stamford, CT.
⁵ Adjunct, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA.

Abstract

Purpose: To enhance automated methods for accurately identifying opioid-related overdoses and classifying types of overdose using electronic health record (EHR) databases.

Methods: We developed a natural language processing (NLP) software application to code clinical text documentation of overdose, including identification of intention for self-harm, substances involved, substance abuse, and error in medication usage. Using datasets balanced with cases of suspected overdose and records of individuals at elevated risk for overdose, we developed and validated the application using Kaiser Permanente Northwest data, then tested portability of the application using Kaiser Permanente Washington data. Datasets were chart-reviewed to provide a gold standard for comparison and evaluation of the automated method.
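As an illustration only (not the authors' software), the comparison of an automated classifier against a chart-review gold standard can be sketched as below. The function name `evaluate`, the `Metrics` container, and the toy labels are hypothetical and are not drawn from the study data; the sketch assumes each record carries one binary label (for example, overdose yes/no) from each source.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Metrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP), positive predictive value
    npv: float          # TN / (TN + FN), negative predictive value


def evaluate(gold: Sequence[bool], predicted: Sequence[bool]) -> Metrics:
    """Compare automated labels with gold-standard labels, record by record."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count the four cells of the 2x2 confusion matrix.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    tn = sum(1 for g, p in zip(gold, predicted) if not g and not p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN when a measure is undefined (empty denominator).
        return numerator / denominator if denominator else float("nan")

    return Metrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
    )


if __name__ == "__main__":
    # Hypothetical toy labels, NOT study data: True means "overdose".
    chart_review = [True, True, True, True, False, False, False, False, False, False]
    nlp_output = [True, True, True, False, False, False, False, False, False, True]
    print(evaluate(chart_review, nlp_output))
```

Each classification reported in the abstract (overdose, intentional overdose, substance involvement, and so on) would be scored as its own binary comparison of this kind.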

Results: The method performed well in identifying overdose (sensitivity = 0.80, specificity = 0.93), intentional overdose (sensitivity = 0.81, specificity = 0.98), and involvement of opioids (excluding heroin, sensitivity = 0.72, specificity = 0.96) and heroin (sensitivity = 0.84, specificity = 1.0). The method performed poorly at identifying adverse drug reactions and overdose due to patient error and fairly at identifying substance abuse in opioid-related unintentional overdose (sensitivity = 0.67, specificity = 0.96). Evaluation using validation datasets yielded significant reductions, in specificity and negative predictive values only, for many classifications mentioned above. However, these measures remained above 0.80; thus, performance observed during development was largely maintained during validation. Similar results were obtained when evaluating portability, although there was a significant reduction in sensitivity for unintentional overdose that was attributed to missing text clinical notes in the database.

Conclusions: Methods that process text clinical notes show promise for improving accuracy and fidelity at identifying and classifying overdoses according to type using EHR data.

Trial registration: ClinicalTrials.gov NCT02667197.

Keywords: electronic health records; methods; natural language processing; opioid overdose; pharmacoepidemiology.

Publication types

Research Support, Non-U.S. Gov't
Validation Study

MeSH terms

Analgesics, Opioid / poisoning*
Datasets as Topic
Drug Overdose / epidemiology*
Electronic Health Records / statistics & numerical data
Heroin / poisoning
Humans
Natural Language Processing*
Opioid-Related Disorders / complications*
Predictive Value of Tests
Risk
Self-Injurious Behavior / epidemiology
Sensitivity and Specificity
Washington

Substances

Analgesics, Opioid
Heroin

Associated data

ClinicalTrials.gov/NCT02667197