Transparent Machine Learning Models to Diagnose Suspicious Thoracic Lesions Leveraging CT Guided Biopsy Data

Acad Radiol. 2022 Feb:29 Suppl 2:S156-S164. doi: 10.1016/j.acra.2021.07.002. Epub 2021 Aug 7.

Abstract

Rationale and objectives: To train and validate machine learning models capable of classifying suspicious thoracic lesions as benign or malignant and to further classify malignant lesions by pathologic subtype while quantifying feature importance for each classification.

Materials and methods: A total of 796 patients who had undergone CT-guided thoracic biopsy for a concerning thoracic lesion (79.3% lung, 11.4% mediastinum, 6.5% pleura, 2.7% chest wall) were retrospectively enrolled. Lesions were classified as malignant or benign based on the ground-truth pathology result, and malignant lesions were further classified as primary or secondary cancer. Clinical variables were extracted from the electronic medical record (EMR) and radiology reports. Supervised binary and multiclass classification models were trained to classify lesions based on the input features and were evaluated on a held-out test set. Model-specific feature analyses were performed to identify the variables most predictive of each class, as well as to assess the independent importance of clinical and imaging features.
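The workflow described above (train on one subset, evaluate on a held-out test set, then rank input features) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the three features (age, lesion size, and a noise column) are hypothetical, a one-feature decision stump stands in for the study's classifiers, and permutation importance is swapped in as a generic, model-agnostic technique in place of the model-specific analyses the authors report.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split row indices into train/test, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        idxs = idxs[:]
        rng.shuffle(idxs)
        n_test = max(1, round(len(idxs) * test_fraction))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_stump(rows, labels):
    """Toy stand-in model: pick the single feature/threshold with the best training accuracy."""
    best = (0.0, 0, 0.0)  # (accuracy, feature index, threshold)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            acc = sum((r[j] > t) == bool(y) for r, y in zip(rows, labels)) / len(rows)
            if acc > best[0]:
                best = (acc, j, t)
    _, j, t = best
    return lambda r: int(r[j] > t)


def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic data (NOT study data): (age, lesion_size_mm, noise); label 1 = "malignant".
rng = random.Random(42)
rows = [(rng.uniform(30, 90), rng.uniform(5, 60), rng.random()) for _ in range(200)]
labels = [int(size > 25) for _, size, _ in rows]  # toy rule: only lesion size matters

train_idx, test_idx = stratified_split(labels)
model = fit_stump([rows[i] for i in train_idx], [labels[i] for i in train_idx])

# Evaluate and rank features on the held-out rows only.
baseline, importances = permutation_importance(
    model, [rows[i] for i in test_idx], [labels[i] for i in test_idx]
)
for name, imp in zip(["age", "lesion_size_mm", "noise"], importances):
    print(f"{name}: mean accuracy drop {imp:.3f}")
```

In this toy setup only the lesion-size column should show a non-zero importance, because the labels were generated from it alone; real clinical features are correlated, which makes such rankings harder to interpret.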

Results: Binary classification models achieved a top accuracy of 80.6%, with predictive features including smoking history, age, lesion size, and lesion location. Multiclass classification models achieved a top weighted-average F1-score of 0.73. Features predictive of primary cancer included smoking history, race, and age, while features predictive of secondary cancer included lesion location and a history of cancer.
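For readers unfamiliar with the multiclass metric, a weighted-average F1-score is the per-class F1 averaged with weights proportional to each class's share of the true labels. The sketch below shows that arithmetic on a toy example; the class names and predictions are hypothetical and are not drawn from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's true-label count."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is >= n > 0 for every class in y_true.
        f1 = 2 * tp / (2 * tp + fp + fn)
        total += n * f1
    return total / len(y_true)


# Toy labels for illustration only.
y_true = ["benign", "benign", "primary", "primary", "primary", "secondary"]
y_pred = ["benign", "primary", "primary", "primary", "secondary", "secondary"]

score = weighted_f1(y_true, y_pred)
print(round(score, 3))  # 0.667: each class has F1 = 2/3 in this toy input
```

Because the weights follow class frequency, a weighted-average F1 is dominated by the most common class, so it can mask poor performance on a rare class.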

Conclusion: Machine learning models enable classification of suspicious thoracic lesions based on clinical and imaging variables, achieving clinically useful performance while identifying the importance of individual input features on a pathology-proven dataset. We believe that transparent models such as these are more likely to be trusted and adopted by clinicians.

Keywords: biopsy-proven; cancer; lung; machine learning; oncology.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Humans
  • Image-Guided Biopsy
  • Machine Learning*
  • Multiparametric Magnetic Resonance Imaging*
  • Retrospective Studies
  • Tomography, X-Ray Computed