Improving sentiment analysis on clinical narratives by exploiting UMLS semantic types

Nuttapong Sanglerdsinlapachai; Anon Plangprasopchok; Tu Bao Ho; Ekawit Nantajeewarawat

doi:10.1016/j.artmed.2021.102033

Artif Intell Med. 2021 Mar:113:102033. doi: 10.1016/j.artmed.2021.102033. Epub 2021 Feb 12.

Authors

Nuttapong Sanglerdsinlapachai¹, Anon Plangprasopchok², Tu Bao Ho³, Ekawit Nantajeewarawat⁴

Affiliations

¹ National Electronics and Computer Technology Center, Pathumthani, Thailand; Japan Advanced Institute of Science and Technology, Ishikawa, Japan; Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand.
² National Electronics and Computer Technology Center, Pathumthani, Thailand.
³ Japan Advanced Institute of Science and Technology, Ishikawa, Japan; John von Neumann Institute, Vietnam National University, Ho Chi Minh City, Vietnam.
⁴ Sirindhorn International Institute of Technology, Thammasat University, Pathumthani, Thailand. Electronic address: ekawit@siit.tu.ac.th.

PMID: 33685589

Abstract

Sentiments associated with assessments and observations recorded in a clinical narrative can often indicate a patient's health status. Sentiment analysis on clinical narratives requires domain-specific knowledge about the meanings of medical terms. In this study, semantic types in the Unified Medical Language System (UMLS) are exploited to improve lexicon-based sentiment classification methods. For sentiment classification using SentiWordNet, the overall accuracy improves from 0.582 to 0.710 when logistic regression is used to determine appropriate polarity scores for UMLS 'Disorders' semantic types. For sentiment classification using a trained lexicon, replacing disorder terms in the training set with their semantic types improves classification accuracy on some data segments containing specific semantic types. Classifier combination is proposed to select an appropriate classification method for a given data segment. With classifier combination, classification accuracy improves on most data segments, and an overall accuracy of 0.882 is obtained.
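
The idea of replacing disorder terms with their semantic types before lexicon-based scoring can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the term-to-type mapping is hand-written rather than looked up in UMLS, the polarity scores are invented rather than fitted by logistic regression or taken from SentiWordNet, and the function names are hypothetical. It is not the authors' implementation.

```python
import re

# Hypothetical toy mapping from disorder terms to semantic type tags.
# Hand-written for illustration; a real system would query UMLS.
TERM_TO_TYPE = {
    "pneumonia": "Disease_or_Syndrome",
    "fever": "Sign_or_Symptom",
    "cough": "Sign_or_Symptom",
}

# Hypothetical polarity lexicon. The scores are invented; the study
# instead determines scores for 'Disorders' semantic types with
# logistic regression.
LEXICON = {
    "Disease_or_Syndrome": -0.6,
    "Sign_or_Symptom": -0.4,
    "improved": 0.7,
    "resolved": 0.8,
    "worsening": -0.7,
}


def replace_disorder_terms(text):
    """Lowercase and tokenize, then swap known disorder terms for type tags."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [TERM_TO_TYPE.get(tok, tok) for tok in tokens]


def polarity_score(tokens):
    """Sum lexicon scores over tokens; unknown tokens contribute 0."""
    return sum(LEXICON.get(tok, 0.0) for tok in tokens)


def classify(text):
    """Label a sentence by the sign of its summed polarity score."""
    score = polarity_score(replace_disorder_terms(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ["Worsening pneumonia", "Symptoms improved"]:
        print(sentence, "->", classify(sentence))
```

Because every term that shares a semantic type collapses to one tag, a single score per type covers disorder terms that never appear in a general-purpose lexicon. The sketch ignores negation and context, which a practical system would need to handle.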

Keywords: Classifier combination; Clinical narrative; Domain-specific knowledge; Lexicon-based sentiment analysis; Unified medical language system.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Humans
Semantics*
Unified Medical Language System*