An Empirical Method of Automatic Pattern Extraction for Clinical Text Classification

Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul:2020:5292-5295. doi: 10.1109/EMBC44109.2020.9176503.

Abstract

Clinical text classification is an essential and extensively studied problem in medical text processing. Existing research primarily employs machine learning and pattern-based approaches to address it. In general, pattern-based approaches perform better than other methods. However, they commonly require human intervention for pattern identification, which diminishes their benefits and restricts their applications. In this study, we present a novel pattern extraction algorithm that automatically identifies and extracts patterns from clinical textual resources. The algorithm identifies the candidate concepts in the clinical text, finds the context of each concept by discovering its context window, and finally transforms each context window into a pattern. We evaluate the proposed algorithm on Hypertension, Rhinosinusitis, and Asthma guidelines. Seventy percent of the Hypertension guideline was used for pattern extraction, while the remaining 30% and the other two guidelines were used for evaluation. The algorithm extracts 21 patterns that classify the sentences of the Hypertension, Rhinosinusitis, and Asthma guidelines into recommendation and non-recommendation sentences with 84.53%, 80.03%, and 84.62% accuracy, respectively. These initial results indicate the benefits and applicability of the algorithm for clinical text classification.
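The three steps named in the abstract (candidate concepts, context windows, patterns) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how concepts are identified or what form a pattern takes, so the cue-word list `CANDIDATE_CONCEPTS`, the fixed window `size`, the regular-expression pattern form, and every function name below are assumptions made purely for illustration.

```python
import re

# Hypothetical cue words standing in for "candidate concepts"; the abstract
# does not describe how the authors identify concepts.
CANDIDATE_CONCEPTS = {"should", "recommend", "recommended", "offer", "consider"}


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def context_windows(tokens, size=1):
    """Yield (left, concept, right) for every candidate concept in tokens."""
    for i, tok in enumerate(tokens):
        if tok in CANDIDATE_CONCEPTS:
            left = tokens[max(0, i - size):i]
            right = tokens[i + 1:i + 1 + size]
            yield left, tok, right


def window_to_pattern(left, concept, right):
    """Turn a context window into a regex: context words become wildcards."""
    parts = [r"\w+"] * len(left) + [re.escape(concept)] + [r"\w+"] * len(right)
    return r"\b" + r"\s+".join(parts) + r"\b"


def extract_patterns(sentences, size=1):
    """Collect the distinct patterns found in the training sentences."""
    patterns = set()
    for sentence in sentences:
        for left, concept, right in context_windows(tokenize(sentence), size):
            patterns.add(window_to_pattern(left, concept, right))
    return sorted(patterns)


def is_recommendation(sentence, patterns):
    """Label a sentence as a recommendation if any pattern matches it."""
    text = " ".join(tokenize(sentence))
    return any(re.search(p, text) for p in patterns)


if __name__ == "__main__":
    training = ["Clinicians should offer lifestyle advice to all patients."]
    patterns = extract_patterns(training)
    for s in ["Patients should measure blood pressure at home.",
              "Hypertension is common in older adults."]:
        print(s, "->", is_recommendation(s, patterns))
```

In this sketch, patterns would be extracted from the training portion of one guideline and then applied unchanged to held-out sentences, which mirrors the 70%/30% and cross-guideline evaluation setup the abstract describes, though not its actual patterns or results.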

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Humans
  • Language
  • Machine Learning*