Automated image label extraction from radiology reports - A review

Artif Intell Med. 2024 Mar:149:102814. doi: 10.1016/j.artmed.2024.102814. Epub 2024 Feb 14.

Abstract

Machine Learning models need large amounts of annotated data for training. In the field of medical imaging, labeled data is especially difficult to obtain because the annotations have to be performed by qualified physicians. Natural Language Processing (NLP) tools can be applied to radiology reports to extract labels for medical images automatically. Compared to manual labeling, this approach requires less annotation effort and can therefore facilitate the creation of labeled medical image data sets. In this article, we summarize the literature on this topic from 2013 to 2023, starting with a meta-analysis of the included articles, followed by a qualitative and quantitative systematization of the results. Overall, we found four types of studies on the extraction of labels from radiology reports: those describing systems based on symbolic NLP, statistical NLP, or neural NLP, and those describing systems that combine or compare two or more of these approaches. Despite the large variety of existing approaches, there is still room for further improvement. This work can contribute to the development of new techniques or the improvement of existing ones.
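
To make the idea of symbolic (rule-based) label extraction concrete, the following is a minimal sketch and not the method of any system covered by the review. It assumes a small, hypothetical vocabulary of findings (`FINDING_PATTERNS`) and a deliberately simplified negation rule (`NEGATION`) that only looks for a cue earlier in the same sentence. The label coding (1 = present, 0 = negated, -1 = not mentioned) is also an assumption chosen for illustration.

```python
import re

# Hypothetical finding vocabulary; a real system would use a curated lexicon.
FINDING_PATTERNS = {
    "pneumothorax": re.compile(r"\bpneumothorax\b", re.IGNORECASE),
    "pleural_effusion": re.compile(r"\bpleural effusions?\b", re.IGNORECASE),
    "cardiomegaly": re.compile(r"\b(cardiomegaly|enlarged heart)\b", re.IGNORECASE),
}

# Simplified negation cues (assumption): a cue counts only if it appears
# before the finding within the same sentence.
NEGATION = re.compile(r"\b(no evidence of|negative for|without|no)\b", re.IGNORECASE)


def extract_labels(report: str) -> dict:
    """Return 1 (present), 0 (negated) or -1 (not mentioned) per finding."""
    labels = {name: -1 for name in FINDING_PATTERNS}
    # Naive sentence splitting on periods, semicolons and line breaks.
    for sentence in re.split(r"[.;\n]+", report):
        for name, pattern in FINDING_PATTERNS.items():
            match = pattern.search(sentence)
            if match is None:
                continue
            negated = NEGATION.search(sentence[: match.start()]) is not None
            if not negated:
                # A positive mention anywhere in the report wins.
                labels[name] = 1
            elif labels[name] == -1:
                labels[name] = 0
    return labels


example = "No pneumothorax. Small left pleural effusion. Heart size is normal."
print(extract_labels(example))
# {'pneumothorax': 0, 'pleural_effusion': 1, 'cardiomegaly': -1}
```

Rules of this kind are easy to inspect but depend on the vocabulary and negation cues being complete, which is one reason statistical and neural approaches are studied as alternatives.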

Keywords: Image annotation; Information extraction; Literature survey; Medical imaging; Medical reports; Natural language processing.

Publication types

  • Meta-Analysis
  • Review

MeSH terms

  • Machine Learning
  • Natural Language Processing*
  • Radiology*