Natural Language Processing of Radiology Text Reports: Interactive Text Classification

Walter F Wiggins; Felipe Kitamura; Igor Santos; Luciano M Prevedello

doi:10.1148/ryai.2021210035

Radiol Artif Intell. 2021 May 12;3(4):e210035. doi: 10.1148/ryai.2021210035. eCollection 2021 Jul.

Authors

Walter F Wiggins¹, Felipe Kitamura¹, Igor Santos¹, Luciano M Prevedello¹

Affiliation

¹ Department of Radiology, Duke University Health System, Duke University Hospital, Box 3808, 2301 Erwin Rd, Durham, NC 27710 (W.F.W.); Department of Diagnostic Imaging, Universidade Federal de São Paulo, Escola Paulista de Medicina, São Paulo, Brazil (F.K., I.S.); Head of AI, Diagnósticos da América SA (DASA), São Paulo, Brazil (F.K.); FIDI, NESS Health, São Paulo, Brazil (I.S.); and Department of Radiology, Ohio State University, Columbus, Ohio (L.M.P.).

Abstract

This report presents a hands-on introduction to natural language processing (NLP) of radiology reports with deep neural networks in Google Colaboratory (Colab), intended to familiarize readers with the rapidly evolving field of NLP. The Google Colab notebook was designed with its code hidden to facilitate learning for noncoders (ie, individuals with little or no computer programming experience). The data used for this module are the radiology reports from the Indiana University chest x-ray collection, available from the National Library of Medicine's Open-I service. The module guides learners through the process of exploring the data, splitting the data for model training and testing, preparing the data for NLP analysis, and training a deep NLP model to classify the reports as normal or abnormal. Concepts in NLP, such as tokenization, numericalization, language modeling, and word embeddings, are demonstrated in the module. The module is implemented in a guided fashion, with the authors presenting the material and explaining concepts. Interactive features and extensive text commentary are provided directly in the notebook to facilitate self-guided learning and experimentation with the module. © RSNA, 2021.
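
For readers unfamiliar with two of the concepts named above, the following is a minimal sketch of tokenization and numericalization in plain Python. It is not the code from the authors' notebook: the example sentences, the function names (`tokenize`, `build_vocab`, `numericalize`), and the reserved `<pad>` and `<unk>` tokens are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical example sentences, not taken from the Open-I corpus.
reports = [
    "The lungs are clear. No pleural effusion.",
    "Right lower lobe opacity. No pneumothorax.",
]


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())


def build_vocab(token_lists, min_freq=1):
    """Map each token to an integer id; ids 0 and 1 are reserved."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def numericalize(tokens, vocab):
    """Replace each token with its id, using <unk> for unseen tokens."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]


tokens = [tokenize(report) for report in reports]
vocab = build_vocab(tokens)
ids = [numericalize(toks, vocab) for toks in tokens]
```

Tokenization turns each report into a list of string tokens, and numericalization turns that list into the integer sequence a neural network consumes. A token absent from the vocabulary falls back to the `<unk>` id, which is one common way to handle words unseen during training.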

Keywords: Computer Applications; Informatics; Natural Language Processing; Negative Expression Recognition; Neural Networks.

© 2021 by the Radiological Society of North America, Inc.