Reproducing RECIST lesion selection via machine learning: Insights into intra and inter-radiologist variation

Teresa M Tareco Bucho; Liliana Petrychenko; Mohamed A Abdelatty; Nino Bogveradze; Zuhir Bodalal; Regina G H Beets-Tan; Stefano Trebeschi

doi:10.1016/j.ejro.2024.100562

Eur J Radiol Open. 2024 Apr 17;12:100562. doi: 10.1016/j.ejro.2024.100562. eCollection 2024 Jun.

Authors

Teresa M Tareco Bucho^{1,2}, Liliana Petrychenko^{1,2}, Mohamed A Abdelatty^{1,3}, Nino Bogveradze^{1,2,4}, Zuhir Bodalal^{1,2}, Regina G H Beets-Tan^{1,2,5}, Stefano Trebeschi^{1,2}

Affiliations

¹ Department of Radiology, Netherlands Cancer Institute, Amsterdam, the Netherlands.
² GROW School for Oncology and Reproduction, Maastricht University, Maastricht, the Netherlands.
³ Department of Radiology, Kasr Al Ainy Hospital, Cairo University, Cairo, Egypt.
⁴ Department of Radiology, American Hospital Tbilisi, Tbilisi, Georgia.
⁵ Faculty of Health Sciences, University of Southern Denmark, Denmark.

Abstract

Background: The Response Evaluation Criteria in Solid Tumors (RECIST) aim to provide a standardized approach to assessing treatment response in solid tumors. However, discrepancies among radiologists in the selection of measurable and target lesions under these criteria significantly limit their reproducibility and accuracy. This study aimed to understand the factors contributing to this variability.

Methods: Machine learning models were used to replicate, in parallel, the selection process of measurable and target lesions by two radiologists in a cohort of 40 patients from an internal pan-cancer dataset. The models were trained on lesion characteristics such as size, shape, texture, rank, and proximity to other lesions. Ablation experiments were conducted to evaluate the impact of lesion diameter, volume, and rank on the selection process.
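The ablation idea described in the Methods can be illustrated with a minimal sketch. The abstract does not name the model family, the feature encoding, or the evaluation protocol, so everything below is an assumption made for illustration: the synthetic cohort, the feature names (`diameter_mm`, `volume_ml`, `rank`), the labeling rule (the two largest lesions per patient are "target"), and the single-threshold classifier standing in for the actual models. Accuracy is measured on the training data only to keep the example short; a real analysis would use held-out data or cross-validation.

```python
import math
import random

# Hypothetical feature names; the study's real feature set is richer.
FEATURES = ("diameter_mm", "volume_ml", "rank")


def make_toy_cohort(n_patients=40, lesions_per_patient=5, seed=0):
    """Synthetic lesions; by construction the two largest per patient are targets."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_patients):
        diameters = sorted(
            (rng.uniform(5.0, 60.0) for _ in range(lesions_per_patient)),
            reverse=True,
        )
        for rank, d in enumerate(diameters, start=1):
            # Sphere volume in mL: radius in cm is d_mm / 20.
            volume = (4.0 / 3.0) * math.pi * (d / 20.0) ** 3
            rows.append(
                {
                    "diameter_mm": d,
                    "volume_ml": volume,
                    "rank": float(rank),
                    "target": rank <= 2,
                }
            )
    return rows


def fit_stump(rows, features):
    """Best single-feature threshold rule: (accuracy, feature, threshold, select_if_below)."""
    best = (0.0, None, None, None)
    for name in features:
        for threshold in sorted({r[name] for r in rows}):
            for select_if_below in (True, False):
                correct = sum(
                    ((r[name] <= threshold) == select_if_below) == r["target"]
                    for r in rows
                )
                acc = correct / len(rows)
                if acc > best[0]:
                    best = (acc, name, threshold, select_if_below)
    return best


def ablation(rows, drop_groups):
    """Refit the toy model with each feature group removed and report accuracy."""
    results = {"all features": fit_stump(rows, FEATURES)[0]}
    for label, dropped in drop_groups.items():
        kept = [f for f in FEATURES if f not in dropped]
        results[f"without {label}"] = fit_stump(rows, kept)[0]
    return results


rows = make_toy_cohort()
results = ablation(rows, {"rank": {"rank"}, "size": {"diameter_mm", "volume_ml"}})
for name, acc in results.items():
    print(f"{name}: training accuracy {acc:.2f}")
```

In this toy setting, removing `rank` can only lower or preserve accuracy, because the labels were generated from rank. The size of the drop after removing a feature group is what an ablation experiment reads as that group's contribution.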

Results: The models successfully reproduced the selection of measurable lesions, relying primarily on size-related features. Similarly, the models reproduced target lesion selection, relying mostly on lesion rank. Beyond these features, the importance that different radiologists placed on individual visual characteristics varied, particularly when choosing target lesions. Notably, substantial variability was still observed between radiologists in both measurable and target lesion selection.
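One common way to quantify agreement between two readers on a binary selection task is Cohen's kappa, which corrects observed agreement for agreement expected by chance. The abstract does not state which agreement measure the study used, so the sketch below is only an illustration under that assumption; the reader vectors are invented, with 1 meaning the lesion was selected and 0 meaning it was not.

```python
def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must label the same non-empty set of lesions")
    n = len(reader_a)
    # Fraction of lesions on which the two readers made the same call.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement from each reader's marginal label frequencies.
    labels = set(reader_a) | set(reader_b)
    expected = sum(
        (reader_a.count(label) / n) * (reader_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        # Both readers used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented selections for eight lesions (1 = selected, 0 = not selected).
reader_a = [1, 1, 0, 0, 1, 0, 0, 0]
reader_b = [1, 0, 0, 0, 1, 1, 0, 0]
kappa = cohen_kappa(reader_a, reader_b)
print(f"kappa = {kappa:.2f}")
```

Here the readers agree on 6 of 8 lesions (0.75), but because both select 3 of 8 lesions, chance agreement is already 34/64, which is why kappa is well below the raw agreement rate.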

Conclusions: Despite the successful replication of lesion selection, our results still revealed significant inter-radiologist disagreement. This underscores the need for more precise guidelines that standardize lesion selection and reduce reliance on individual interpretation and experience to resolve existing ambiguities.

Keywords: Cancer imaging; Machine learning; RECIST; Reproducibility; Variability.