A study on textual features for medical records classification

Anita Alicante; Flora Amato; Giovanni Cozzolino; Francesco Gargiulo; Nicla Improda; Antonino Mazzeo

Stud Health Technol Inform. 2014:207:370-9.

Authors

Anita Alicante¹, Flora Amato¹, Giovanni Cozzolino¹, Francesco Gargiulo¹, Nicla Improda¹, Antonino Mazzeo¹

Affiliation

¹ Department of Electrical Engineering and Information Technology (DIETI), University of Naples Federico II.

PMID: 25488243

Abstract

The healthcare domain is characterized by a huge amount of data contained in medical records, reports, test results, and so on. To support healthcare workers and manage relevant data effectively and efficiently, it is important to correctly classify the unstructured text embedded in medical documents. In this paper, we propose a classification system for medical record categorization that combines different methodologies based on lexical, syntactic, and semantic analysis of the documents. We show that a classification system based on a combination of different text analysis methodologies outperforms each methodology taken alone. The results are presented in terms of Accuracy-Rejection Curves. Finally, the pros and cons of the proposed architecture and some directions for future work are pointed out.
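
As an illustration of how an Accuracy-Rejection Curve can be read, the following minimal Python sketch combines class probabilities from three feature views and reports the accuracy on accepted documents at several rejection thresholds. The abstract does not state the combination rule or the rejection criterion used in the paper, so the averaging step, the maximum-probability threshold, the function names, and all toy numbers below are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple


def combine_scores(per_view_probs: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Average class-probability vectors across feature views.

    per_view_probs[v][i][c] is the probability that view v assigns to
    class c for document i. Averaging is an assumed combination rule.
    """
    n_views = len(per_view_probs)
    n_docs = len(per_view_probs[0])
    combined = []
    for i in range(n_docs):
        n_classes = len(per_view_probs[0][i])
        combined.append([
            sum(per_view_probs[v][i][c] for v in range(n_views)) / n_views
            for c in range(n_classes)
        ])
    return combined


def accuracy_rejection_curve(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    thresholds: Sequence[float],
) -> List[Tuple[float, float]]:
    """Return (rejection_rate, accuracy_on_accepted) for each threshold.

    A document is rejected when the highest class probability falls
    below the threshold (an assumed rejection criterion).
    """
    curve = []
    for t in thresholds:
        accepted = 0
        correct = 0
        for p, y in zip(probs, labels):
            pred = max(range(len(p)), key=p.__getitem__)
            if p[pred] >= t:
                accepted += 1
                correct += int(pred == y)
        rejection = 1 - accepted / len(labels)
        # Accuracy is undefined when every document is rejected.
        accuracy = correct / accepted if accepted else float("nan")
        curve.append((rejection, accuracy))
    return curve


# Hypothetical toy data: 4 documents, 2 classes, 3 feature views.
lexical = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
syntactic = [[0.8, 0.2], [0.4, 0.6], [0.2, 0.8], [0.6, 0.4]]
semantic = [[0.7, 0.3], [0.55, 0.45], [0.1, 0.9], [0.65, 0.35]]
labels = [0, 1, 1, 0]

combined = combine_scores([lexical, syntactic, semantic])
curve = accuracy_rejection_curve(combined, labels, [0.0, 0.55, 0.7, 0.95])
for rejection, accuracy in curve:
    print(f"rejection={rejection:.2f}  accuracy={accuracy:.2f}")
```

On this toy data, the single misclassified document is also the least confident one, so rejecting it raises the accuracy on the remaining documents. This is the kind of trade-off between coverage and accuracy that such a curve is meant to expose.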

MeSH terms

Electronic Health Records / classification*
Electronic Health Records / standards*
Humans
Terminology as Topic*