Generalized Extraction and Classification of Span-Level Clinical Phrases

AMIA Annu Symp Proc. 2018 Dec 5;2018:205-214. eCollection 2018.

Abstract

Much of the critical information in a patient's electronic health record (EHR) is hidden in unstructured text. As such, there is an increasing role for automated text extraction and summarization to make this information available in a form that can be quickly and easily understood. While many techniques for extracting text from clinical notes have been examined, most are either narrowly targeted or focus primarily on concept-level extraction, potentially missing important contextual information. In contrast, this work examines the extraction of several clinical categories at the phrase level, attempting to provide the necessary context while keeping the extracted elements concise. To do so, we employ a three-stage pipeline that extracts categorized phrases of interest using clinical concepts as anchor points. Results suggest that the proposed method achieves performance comparable to that of individual human annotators.

MeSH terms

  • Electronic Health Records*
  • Humans
  • Information Storage and Retrieval / methods*
  • Natural Language Processing*