Improving sentiment analysis on clinical narratives by exploiting UMLS semantic types

Artif Intell Med. 2021 Mar:113:102033. doi: 10.1016/j.artmed.2021.102033. Epub 2021 Feb 12.

Abstract

Sentiments associated with assessments and observations recorded in a clinical narrative can often indicate a patient's health status. Sentiment analysis on clinical narratives requires domain-specific knowledge of the meanings of medical terms. In this study, semantic types in the Unified Medical Language System (UMLS) are exploited to improve lexicon-based sentiment classification methods. For sentiment classification using SentiWordNet, overall accuracy improves from 0.582 to 0.710 when logistic regression is used to determine appropriate polarity scores for UMLS 'Disorders' semantic types. For sentiment classification using a trained lexicon, replacing disorder terms in the training set with their semantic types improves classification accuracy on some data segments containing specific semantic types. To select an appropriate classification method for a given data segment, classifier combination is proposed. With classifier combination, classification accuracy improves on most data segments, and an overall accuracy of 0.882 is obtained.
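
Two of the steps described above, replacing disorder terms with their semantic types and choosing a classification method per data segment, can be illustrated with a minimal sketch. This is not the authors' implementation. The term-to-type dictionary is a hand-made toy standing in for a real UMLS lookup, the function names are hypothetical, and "pick the method with the highest validation accuracy on each segment" is only one plausible reading of classifier combination, assumed here for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy mapping from disorder terms to semantic-type placeholders.
# A real system would obtain the semantic types from a UMLS lookup.
TERM_TO_SEMTYPE = {
    "pneumonia": "Disease_or_Syndrome",
    "fever": "Sign_or_Symptom",
    "fracture": "Injury_or_Poisoning",
}


def replace_disorder_terms(text, mapping=TERM_TO_SEMTYPE):
    """Replace each known disorder term with its semantic-type placeholder."""
    if not mapping:
        return text
    # Longest terms first so multi-word terms win over their substrings.
    terms = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mapping[m.group(0).lower()], text)


def select_classifier_per_segment(validation_results):
    """Pick, for each data segment, the method with the best validation accuracy.

    validation_results: iterable of (segment, classifier_name, is_correct).
    Returns {segment: classifier_name}. Ties keep the first method seen.
    This selection rule is an assumption, not the rule from the paper.
    """
    counts = defaultdict(lambda: [0, 0])  # (segment, clf) -> [correct, total]
    for segment, clf, is_correct in validation_results:
        counts[(segment, clf)][0] += int(is_correct)
        counts[(segment, clf)][1] += 1

    best = {}  # segment -> (classifier_name, accuracy)
    for (segment, clf), (correct, total) in counts.items():
        accuracy = correct / total
        if segment not in best or accuracy > best[segment][1]:
            best[segment] = (clf, accuracy)
    return {segment: clf for segment, (clf, _) in best.items()}


if __name__ == "__main__":
    print(replace_disorder_terms("No fever, pneumonia resolved."))
    toy_results = [
        ("Sign_or_Symptom", "sentiwordnet", True),
        ("Sign_or_Symptom", "sentiwordnet", False),
        ("Sign_or_Symptom", "trained_lexicon", True),
    ]
    print(select_classifier_per_segment(toy_results))
```

Replacing terms with placeholders lets a trained lexicon learn one polarity weight per semantic type rather than one per rare disorder term, which is the intuition the abstract points to. The per-segment selection then routes each segment to whichever method performed best on held-out data.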

Keywords: Classifier combination; Clinical narrative; Domain-specific knowledge; Lexicon-based sentiment analysis; Unified medical language system.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Semantics*
  • Unified Medical Language System*