Personalizing lung cancer risk prediction and imaging follow-up recommendations using the National Lung Screening Trial dataset

J Am Med Inform Assoc. 2017 Nov 1;24(6):1046-1051. doi: 10.1093/jamia/ocx012.

Abstract

Objective: To demonstrate a data-driven method for personalizing lung cancer risk prediction using a large clinical dataset.

Materials and methods: An algorithm was used to categorize nodules found in the first screening year of the National Lung Screening Trial as malignant or nonmalignant. Malignancy risk for each nodule was calculated from the size criteria of the 2005 Fleischner Society recommendations, together with 3 additional discriminators: pack-years of smoking history, sex, and nodule location. Imaging follow-up recommendations were assigned according to the malignancy risk of each Fleischner size category.
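The stratification step described above can be illustrated with a minimal sketch. It is not the authors' algorithm. The record fields (`diameter_mm`, `sex`, `pack_years`, `lobe`, `malignant`), the lobe codes, and the function names are hypothetical. The binary cutoffs (≥60 pack-years, upper lobe) are assumed from the example reported in the Results. The size bins follow the 2005 Fleischner categories (≤4, >4 to 6, >6 to 8, >8 mm).

```python
from collections import defaultdict

# Hypothetical lobe codes for upper-lobe nodules (an assumption, not from the source).
UPPER_LOBES = {"RUL", "LUL"}


def size_category(diameter_mm):
    """Map a nodule diameter in mm to a 2005 Fleischner size bin."""
    if diameter_mm <= 4:
        return "<=4"
    if diameter_mm <= 6:
        return ">4-6"
    if diameter_mm <= 8:
        return ">6-8"
    return ">8"


def stratum(nodule):
    """Build a stratum key: size bin, sex, heavy smoker, upper lobe."""
    heavy_smoker = nodule["pack_years"] >= 60  # assumed cutoff
    upper_lobe = nodule["lobe"] in UPPER_LOBES
    return (size_category(nodule["diameter_mm"]), nodule["sex"],
            heavy_smoker, upper_lobe)


def malignancy_rates(nodules):
    """Return the observed malignancy fraction for each stratum."""
    counts = defaultdict(lambda: [0, 0])  # key -> [malignant, total]
    for nodule in nodules:
        key = stratum(nodule)
        counts[key][0] += int(nodule["malignant"])
        counts[key][1] += 1
    return {key: mal / total for key, (mal, total) in counts.items()}
```

Each rate is an empirical fraction within one stratum, so sparsely populated strata would need confidence intervals or pooling before being used for any recommendation.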

Results: Nodule size correlated with malignancy risk as predicted by the Fleischner Society recommendations. With the additional discriminators of smoking history, sex, and nodule location, significant risk stratification was observed. For example, men with ≥60 pack-years smoking history and upper lobe nodules measuring >4 and ≤6 mm demonstrated significantly increased risk of malignancy at 12.4% compared to the mean of 3.81% for similarly sized nodules (P < .0001). Based on personalized malignancy risk, 54% of nodules >4 and ≤6 mm were reclassified to longer-term follow-up than recommended by Fleischner. Twenty-seven percent of nodules ≤4 mm were reclassified to shorter-term follow-up.
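The reclassification reported above can be sketched as follows. The abstract does not state the exact rule used, so the nearest-mean rule here is an assumption: a nodule moves to the Fleischner size category whose mean malignancy risk is closest to its personalized risk. The names `follow_up_shift` and `category_mean_risk` are hypothetical, and the per-category mean risks must be supplied by the caller.

```python
# Size categories ordered from lowest to highest expected malignancy risk.
# A larger category implies shorter-term (more aggressive) follow-up.
ORDER = ["<=4", ">4-6", ">6-8", ">8"]


def nearest_category(personal_risk, category_mean_risk):
    """Return the size category whose mean risk is closest to personal_risk."""
    return min(ORDER, key=lambda c: abs(category_mean_risk[c] - personal_risk))


def follow_up_shift(original_category, personal_risk, category_mean_risk):
    """Classify how follow-up changes relative to the size-only category.

    Returns "shorter", "longer", or "unchanged".
    """
    target = nearest_category(personal_risk, category_mean_risk)
    delta = ORDER.index(target) - ORDER.index(original_category)
    if delta > 0:
        return "shorter"  # risk resembles a larger-nodule category
    if delta < 0:
        return "longer"   # risk resembles a smaller-nodule category
    return "unchanged"
```

Under this rule, a nodule measuring >4 and ≤6 mm whose personalized risk sits well above its category mean would shift toward shorter-term follow-up, while one well below the mean would shift toward longer-term follow-up.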

Discussion: Using available clinical datasets such as the National Lung Screening Trial in conjunction with locally collected datasets can help clinicians provide more personalized malignancy risk predictions and follow-up recommendations.

Conclusion: By incorporating 3 additional data points (smoking history, sex, and nodule location), the risk of lung nodule malignancy within the Fleischner categories can be considerably stratified and more personalized follow-up recommendations can be made.

Keywords: cancer screening; clinical decision support; data mining; lung cancer; medical informatics.

MeSH terms

  • Aged
  • Algorithms*
  • Data Mining
  • Datasets as Topic
  • Decision Support Techniques
  • Early Detection of Cancer*
  • Female
  • Follow-Up Studies
  • Humans
  • Lung Neoplasms* / diagnosis
  • Male
  • Middle Aged
  • Odds Ratio
  • Risk Assessment / methods*
  • Risk Factors
  • Smoking
  • Solitary Pulmonary Nodule / pathology*
  • United States