Combining Machine Learning with a Rule-Based Algorithm to Detect and Identify Related Entities of Documented Adverse Drug Reactions on Hospital Discharge Summaries

Drug Saf. 2022 Aug;45(8):853-862. doi: 10.1007/s40264-022-01196-x. Epub 2022 Jul 6.

Abstract

Introduction: Discharge summaries contain valuable information about adverse drug reactions, but their unstructured nature makes them challenging to analyse and use as a signal source for pharmacovigilance. Machine learning has shown promise in identifying discharge summaries that contain related drug-adverse event pairs but has fared less well in entity extraction.

Methods: A hybrid model combining rule-based and machine learning algorithms is developed on discharge summaries, with the aim of maximising capture of related drug-adverse event pairs. The rule first identifies text segments in which an adverse event entity lies within 100 characters of a drug term; machine learning then estimates the relatedness of the drug and adverse event entities in each segment. The approach is validated on four independent datasets that are temporally and geographically separated from the model development data. The impact of restricting drug-adverse event pair detection on recall is evaluated using the two validation datasets whose annotations are not subject to the rule-based restrictions.
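
The rule-based step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the term lists, the function names (`find_spans`, `candidate_pairs`) and the way distance is measured (the number of characters separating the two matched terms) are all assumptions made for illustration, since the abstract only states the 100-character limit. The machine learning step that scores relatedness is not shown.

```python
import re
from typing import List, Tuple

MAX_GAP = 100  # character limit stated in the abstract


def find_spans(text: str, terms: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, term) for each case-insensitive whole-word match."""
    spans = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), term))
    return sorted(spans)


def candidate_pairs(
    text: str,
    drugs: List[str],
    events: List[str],
    max_gap: int = MAX_GAP,
) -> List[Tuple[str, str, str]]:
    """Return (drug, event, segment) for term pairs at most max_gap characters apart.

    Assumption: the gap is the count of characters between the end of one
    term and the start of the other; the abstract does not define this.
    """
    drug_spans = find_spans(text, drugs)
    event_spans = find_spans(text, events)
    pairs = []
    for d_start, d_end, drug in drug_spans:
        for e_start, e_end, event in event_spans:
            # One of the two differences is positive when the spans do not overlap.
            gap = max(e_start - d_end, d_start - e_end, 0)
            if gap <= max_gap:
                seg_start = min(d_start, e_start)
                seg_end = max(d_end, e_end)
                pairs.append((drug, event, text[seg_start:seg_end]))
    return pairs
```

Each returned segment would then be passed to a classifier that estimates whether the drug and adverse event are related. A drug-adverse event pair that falls outside the window is never scored, which is the restriction whose effect on recall the study evaluates.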

Results: The hybrid model achieves a recall of 0.80 (fivefold cross-validation), 0.80 (temporal validation) and 0.76 (geographical validation) on datasets containing only pre-identified target text segments that fulfil the rule-based criteria. When tested on datasets that additionally contain drug-adverse event pairs not restricted by the rule-based criteria, recall declines to 0.68 and 0.62 on the temporally and geographically separated datasets, respectively.

Conclusions: The proposed hybrid model demonstrates reasonable generalisability on external validation. Rule-based restriction of the detection space reduces recall by approximately 12-14 percentage points (0.80 to 0.68 and 0.76 to 0.62) but improves identification of the related drug and adverse event terms.
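
The size of the recall reduction can be checked against the figures reported in the results. The short Python sketch below uses a hypothetical helper, `recall_drop`, to compute the absolute and relative differences between the restricted and unrestricted recall values. The reported 12-14 figure corresponds to the absolute differences (0.12 and 0.14); the relative reductions work out to roughly 15% and 18%.

```python
from typing import Tuple


def recall_drop(restricted: float, unrestricted: float) -> Tuple[float, float]:
    """Return (absolute, relative) reduction in recall.

    absolute: difference between the two recall values
    relative: that difference as a fraction of the restricted recall
    """
    absolute = restricted - unrestricted
    relative = absolute / restricted
    return absolute, relative


# Recall values as reported in the abstract: (restricted, unrestricted).
reported = {
    "temporal": (0.80, 0.68),
    "geographical": (0.76, 0.62),
}

for name, (restricted, unrestricted) in reported.items():
    absolute, relative = recall_drop(restricted, unrestricted)
    print(f"{name}: absolute {absolute:.2f}, relative {relative:.1%}")
```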

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Drug-Related Side Effects and Adverse Reactions* / diagnosis
  • Drug-Related Side Effects and Adverse Reactions* / epidemiology
  • Hospitals
  • Humans
  • Machine Learning
  • Patient Discharge*