Automated image label extraction from radiology reports - A review

Sofia C Pereira; Ana Maria Mendonça; Aurélio Campilho; Pedro Sousa; Carla Teixeira Lopes

doi:10.1016/j.artmed.2024.102814

Artif Intell Med. 2024 Mar:149:102814. doi: 10.1016/j.artmed.2024.102814. Epub 2024 Feb 14.

Authors

Sofia C Pereira¹, Ana Maria Mendonça², Aurélio Campilho³, Pedro Sousa⁴, Carla Teixeira Lopes⁵

Affiliations

¹ Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal. Electronic address: sofia.c.pereira@inesctec.pt.
² Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal. Electronic address: amendon@fe.up.pt.
³ Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal. Electronic address: campilho@fe.up.pt.
⁴ Hospital Center of Vila Nova de Gaia/Espinho, Portugal. Electronic address: pedro.teixeira.sousa@chvng.min-saude.pt.
⁵ Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal. Electronic address: ctl@fe.up.pt.

PMID: 38462277

Abstract

Machine learning models need large amounts of annotated data for training. In medical imaging, labeled data is especially difficult to obtain because the annotations must be performed by qualified physicians. Natural Language Processing (NLP) tools can be applied to radiology reports to extract labels for medical images automatically. Compared to manual labeling, this approach requires less annotation effort and can therefore facilitate the creation of labeled medical image data sets. In this article, we summarize the literature on this topic from 2013 to 2023, starting with a meta-analysis of the included articles, followed by a qualitative and quantitative systematization of the results. Overall, we found four types of studies on the extraction of labels from radiology reports: those describing systems based on symbolic NLP, on statistical NLP, or on neural NLP, and those describing systems that combine or compare two or more of these approaches. Despite the large variety of existing approaches, there is still room for further improvement. This work can contribute to the development of new techniques or the improvement of existing ones.
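
To make the idea of extracting image labels from report text concrete, the following is a minimal sketch of a symbolic (rule-based) labeler, the first of the four families named above. It is not taken from the review or from any system it covers: the finding vocabulary `FINDING_TERMS`, the negation cues, and the function `extract_labels` are all illustrative assumptions, and a real system would need a curated lexicon plus handling of uncertainty and of negation scope.

```python
import re

# Hypothetical finding vocabulary (assumption, for illustration only).
FINDING_TERMS = {
    "pneumothorax": ["pneumothorax"],
    "pleural_effusion": ["pleural effusion", "effusion"],
    "cardiomegaly": ["cardiomegaly", "enlarged heart"],
}

# Simplistic negation cues (assumption): a cue negates any finding
# that appears after it in the same sentence.
NEGATION_CUES = re.compile(r"\b(no|without|negative for|free of)\b")


def extract_labels(report):
    """Map each mentioned finding to 1 (affirmed) or 0 (negated).

    Findings that are never mentioned are left out of the result.
    """
    labels = {}
    # Split the report into rough sentences on periods, semicolons, newlines.
    for sentence in re.split(r"[.;\n]+", report):
        lowered = sentence.lower()
        negation = NEGATION_CUES.search(lowered)
        for label, terms in FINDING_TERMS.items():
            for term in terms:
                match = re.search(r"\b" + re.escape(term) + r"\b", lowered)
                if match is None:
                    continue
                negated = negation is not None and negation.start() < match.start()
                value = 0 if negated else 1
                # An affirmed mention anywhere in the report overrides a negated one.
                labels[label] = max(labels.get(label, 0), value)
                break  # one matching term per label per sentence is enough
    return labels


if __name__ == "__main__":
    text = "No pneumothorax. Small left pleural effusion. Heart size is normal."
    print(extract_labels(text))
```

Labels produced this way could be attached to the image that the report describes, which is what lets report text stand in for manual image annotation. The statistical and neural families replace the hand-written rules with learned classifiers.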

Keywords: Image annotation; Information extraction; Literature survey; Medical imaging; Medical reports; Natural language processing.

Publication types

Meta-Analysis
Review

MeSH terms

Machine Learning
Natural Language Processing*
Radiology*