Machine vs. Radiologist-Based Translations of RadLex: Implications for Multi-language Report Interoperability

J Digit Imaging. 2022 Jun;35(3):660-665. doi: 10.1007/s10278-022-00597-9. Epub 2022 Feb 15.

Abstract

The purpose of this study was to evaluate the feasibility of translating the RadLex lexicon from English to German with Google Translate, using the RadLex ontology as ground truth. The same comparison was also performed for German to English translations. We determined the concordance rate of the Google Translate-rendered translations (for both English to German and German to English) with the official German RadLex terms (translations provided by the German Radiological Society) and the English RadLex terms via character-by-character concordance analysis (string matching). Term character count and word count were compared between concordant and discordant translations using t-tests. Google Translate-rendered translations initially considered discordant (2482 English terms and 2500 German terms) were then reviewed by German- and English-speaking radiologists to further evaluate clinical utility. The overall success rate for each translation direction was calculated by adding the percentage of terms marked correct by string comparison to the percentage marked correct during manual review, extrapolated to the terms initially marked incorrect during string analysis. In total, 64,632 English and 47,425 German RadLex terms were analyzed. 3507 (5.4%) of the Google Translate-rendered English to German translations were concordant with the official German RadLex terms when evaluated via character-by-character concordance. 3288 (6.9%) of the Google Translate-rendered German to English translations matched the corresponding English RadLex terms. Human review of a random sample of non-concordant machine translations revealed that 95.5% of such English to German translations were understandable, whereas only 43.9% of such German to English translations were. Combining string matching and human review resulted in an overall Google Translate success rate of 95.7% for English to German translations and 47.8% for German to English translations. For certain radiologic text translation tasks, Google Translate may be a useful tool for translating multi-language radiology reports into a common language for natural language processing and subsequent labeling of datasets for machine learning. String matching alone is an incomplete method for evaluating machine translation; when human review of automated translation is also incorporated, measured performance improves. Additional evaluation using longer text samples and full imaging reports is needed. The discrepancy between the English to German and German to English results suggests that the direction of translation affects accuracy.
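The two-stage evaluation described above can be summarized in a minimal Python sketch. The helper names and the whitespace/case normalization are assumptions for illustration, not the authors' published code; only the combination rule (string-match rate plus human-review rate applied to the initially discordant remainder) and the reported percentages come from the abstract.

    # Sketch of the abstract's two evaluation stages (illustrative only).

    def is_concordant(machine_term: str, reference_term: str) -> bool:
        """Character-by-character concordance (string matching).
        Trimming whitespace and ignoring case are assumed normalizations."""
        return machine_term.strip().lower() == reference_term.strip().lower()

    def overall_success_rate(string_match_rate: float,
                             human_review_rate: float) -> float:
        """Combine the stages as described: terms correct by string matching,
        plus the share of the remaining (initially discordant) terms judged
        understandable on human review."""
        return string_match_rate + (1.0 - string_match_rate) * human_review_rate

    # Worked example with the figures reported in the abstract.
    # English -> German: 5.4% concordant, 95.5% of discordant sample understandable.
    print(round(overall_success_rate(0.054, 0.955) * 100, 1))  # 95.7
    # German -> English: 6.9% concordant, 43.9% of discordant sample understandable.
    print(round(overall_success_rate(0.069, 0.439) * 100, 1))  # 47.8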

Keywords: Artificial intelligence; Informatics; Natural language processing; RadLex; Reports; Translation.

MeSH terms

  • Humans
  • Language*
  • Natural Language Processing
  • Radiologists
  • Translating*
  • Translations