Computational risk model for predicting 2-year malignancy of pulmonary nodules using demographic and radiographic characteristics

J Thorac Cardiovasc Surg. 2024 Jun;167(6):1910-1924.e2. doi: 10.1016/j.jtcvs.2023.09.027. Epub 2023 Sep 17.

Abstract

Objectives: To determine whether the discriminatory performance of computational risk models in classifying pulmonary lesion malignancy from demographic, radiographic, and clinical characteristics is superior to the opinion of experienced providers. We hypothesized that computational risk models would outperform providers.

Methods: Outcome of malignancy was obtained from selected patients enrolled in the NAVIGATE trial (NCT02410837). Five predictive risk models were developed using an 80:20 train-test split: a univariable logistic regression model based solely on provider opinion, a multivariable logistic regression model, a random forest classifier, an extreme gradient boosting model, and an artificial neural network. The area under the receiver operating characteristic curve achieved by each predictive model during testing was compared with that of the prebiopsy provider opinion baseline using the DeLong test with 10,000 bootstrapped iterations.
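
The comparison described above can be illustrated with a minimal sketch. It is not the study's code: it uses a plain paired percentile bootstrap of the difference in area under the curve rather than the DeLong test itself, and the function names (`auc`, `bootstrap_auc_difference`), the default of 1,000 resamples, and the toy labels and scores are all assumptions made for illustration.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_iter=1000, seed=0):
    """Paired percentile bootstrap of AUC(a) - AUC(b) on the same test cases.

    Returns the observed difference and an approximate 95% interval.
    This is a hypothetical stand-in for the DeLong test, not the test itself.
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    while len(diffs) < n_iter:
        # Resample cases with replacement, keeping both scores paired.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined without both classes; redraw
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    lower = diffs[int(0.025 * n_iter)]
    upper = diffs[min(n_iter - 1, int(0.975 * n_iter))]
    return observed, (lower, upper)


# Toy data only: 1 = malignant, 0 = benign; scores are made-up probabilities.
toy_labels = [1, 1, 1, 0, 1, 0, 1, 0, 1, 1]
toy_provider = [0.9, 0.8, 0.7, 0.2, 0.6, 0.4, 0.95, 0.3, 0.5, 0.85]
toy_model = [0.7, 0.6, 0.4, 0.5, 0.8, 0.3, 0.9, 0.6, 0.45, 0.55]
diff, interval = bootstrap_auc_difference(toy_labels, toy_provider, toy_model)
```

An interval for the difference that excludes zero would suggest that the two scoring systems differ in discrimination on the resampled test cases. A full analysis would apply the same idea to the held-out 20% test set.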

Results: The cohort included 984 patients, 735 (74.7%) of whom were diagnosed with malignancy. Factors associated with malignancy in multivariable logistic regression included age, history of cancer, largest lesion size, lung zone, and positron-emission tomography positivity. Testing areas under the receiver operating characteristic curve were 0.830 for provider opinion baseline, 0.770 for provider opinion univariable logistic regression, 0.659 for multivariable logistic regression model, 0.743 for random forest classifier, 0.740 for extreme gradient boosting, and 0.679 for artificial neural network. Provider opinion baseline was determined to be the best predictive classification system.

Conclusions: Computational models predicting malignancy of pulmonary lesions using clinical, demographic, and radiographic characteristics are inferior to provider opinion. This study questions the ability of these models to provide additional insight into patient care. Expert clinician evaluation of pulmonary lesion malignancy is paramount.

Keywords: machine learning; pulmonary lesion malignancy; risk modeling.

MeSH terms

  • Aged
  • Decision Support Techniques
  • Female
  • Humans
  • Lung Neoplasms* / diagnostic imaging
  • Lung Neoplasms* / epidemiology
  • Lung Neoplasms* / pathology
  • Male
  • Middle Aged
  • Multiple Pulmonary Nodules / diagnostic imaging
  • Multiple Pulmonary Nodules / pathology
  • Neural Networks, Computer
  • Predictive Value of Tests
  • Risk Assessment
  • Risk Factors
  • Solitary Pulmonary Nodule / diagnostic imaging
  • Solitary Pulmonary Nodule / pathology
  • Time Factors
  • Tomography, X-Ray Computed