OWL: an optimized and independently validated machine learning prediction model for lung cancer screening based on the UK Biobank, PLCO, and NLST populations

Zoucheng Pan; Ruyang Zhang; Sipeng Shen; Yunzhi Lin; Longyao Zhang; Xiang Wang; Qian Ye; Xuan Wang; Jiajin Chen; Yang Zhao; David C Christiani; Yi Li; Feng Chen; Yongyue Wei

doi:10.1016/j.ebiom.2023.104443


EBioMedicine. 2023 Feb:88:104443. doi: 10.1016/j.ebiom.2023.104443. Epub 2023 Jan 24.

Authors

Zoucheng Pan¹, Ruyang Zhang¹, Sipeng Shen¹, Yunzhi Lin¹, Longyao Zhang¹, Xiang Wang¹, Qian Ye¹, Xuan Wang¹, Jiajin Chen¹, Yang Zhao¹, David C Christiani², Yi Li³, Feng Chen⁴, Yongyue Wei⁵

Affiliations

¹ Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, 211166, China.
² Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA; Pulmonary and Critical Care Division, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA.
³ Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.
⁴ Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, 211166, China. Electronic address: fengchen@njmu.edu.cn.
⁵ Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, 211166, China; Peking University Center for Public Health and Epidemic Preparedness & Response, Xueyuan Road, Haidian District, Beijing 100191, China. Electronic address: ywei@pku.edu.cn.

Abstract

Background: A reliable risk prediction model is critically important for identifying individuals at high risk of developing lung cancer as candidates for low-dose chest computed tomography (LDCT) screening. Leveraging a cutting-edge machine learning technique that accommodates a broad set of questionnaire-based predictors, we sought to optimize and validate a lung cancer prediction model.

Methods: We developed an Optimized early Warning model for Lung cancer risk (OWL) using the XGBoost algorithm with 323,344 participants from England in the UK Biobank (UKB) (training set), and independently validated it with 93,227 UKB participants from Scotland and Wales (validation set 1), as well as 70,605 and 66,231 participants in the Prostate, Lung, Colorectal, and Ovarian cancer screening trial (PLCO) control and intervention subpopulations, respectively (validation sets 2 & 3), and 23,138 and 18,669 participants in the United States National Lung Screening Trial (NLST) control and intervention subpopulations, respectively (validation sets 4 & 5). We compared OWL with three competing prediction models, i.e., PLCO modified 2012 (PLCO_m2012), PLCO modified 2014 (PLCO_all2014), and the Liverpool Lung cancer Project risk model version 3 (LLPv3), and assessed discrimination by the area under the receiver operating characteristic curve (AUC) at the designed time point. We further evaluated calibration using the relative improvement in the ratio of expected to observed lung cancer cases (RI_EO), and illustrated clinical utility by decision curve analysis.
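The discrimination and calibration measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the AUC here is the plain binary (Mann-Whitney) version and ignores follow-up time and censoring, and the abstract does not give the formula for RI_EO, so `relative_improvement_eo` assumes it is the proportional reduction in the distance of the expected/observed ratio from 1. All function names are hypothetical.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Binary AUC: probability that a random case outranks a random non-case.

    Ties count as half a win. Time-to-event aspects are ignored (assumption).
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def expected_observed_ratio(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """E/O ratio: sum of predicted risks divided by the number of observed cases."""
    observed = sum(y_true)
    if observed == 0:
        raise ValueError("no observed cases")
    return sum(y_prob) / observed


def relative_improvement_eo(eo_new: float, eo_ref: float) -> float:
    """Assumed RI_EO: share of the reference model's miscalibration |E/O - 1|
    that the new model removes. This definition is an assumption."""
    ref_error = abs(eo_ref - 1.0)
    if ref_error == 0:
        raise ValueError("reference model is already perfectly calibrated")
    return (ref_error - abs(eo_new - 1.0)) / ref_error
```

Under this assumed definition, a reference model with E/O = 1.5 and a new model with E/O = 1.1 would give a relative improvement of about 80%, since the distance from perfect calibration shrinks from 0.5 to 0.1.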

Findings: In the general population of validation set 1, OWL (AUC = 0.855, 95% CI: 0.829-0.880) showed better discrimination than PLCO_all2014 (AUC = 0.821, 95% CI: 0.794-0.848) (p < 0.001); in validation sets 2 & 3, the AUC of OWL was comparable to that of PLCO_all2014 (AUC_PLCOall2014-AUC_OWL < 1%). Among ever-smokers in validation set 1, OWL outperformed PLCO_m2012 and PLCO_all2014 (AUC_OWL = 0.842, 95% CI: 0.814-0.871; AUC_PLCOm2012 = 0.792, 95% CI: 0.760-0.823; AUC_PLCOall2014 = 0.791, 95% CI: 0.760-0.822, all p < 0.001). Among ever-smokers in validation sets 2 to 5, OWL remained comparable to PLCO_m2012 and PLCO_all2014 in discrimination (AUC difference from -0.014 to 0.008). In all validation sets, OWL outperformed LLPv3 in both the general population and ever-smokers. Of note, in most cases OWL showed significantly better calibration than PLCO_m2012 and PLCO_all2014 (RI_EO from 43.1% to 92.3%, all p < 0.001) and LLPv3 (RI_EO from 41.4% to 98.7%, all p < 0.001). For clinical utility, OWL showed a significant improvement in average net benefit (NB) over PLCO_all2014 in validation set 1 (NB improvement: 32, p < 0.001); among ever-smokers in validation set 1, OWL (average NB = 289) retained a significant improvement over PLCO_m2012 (average NB = 213) (p < 0.001). OWL had NBs equivalent to those of PLCO_m2012 and PLCO_all2014 in the PLCO and NLST populations, while outperforming LLPv3 in all three populations.
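The net benefit comparison can be made concrete with a short sketch of the standard decision curve formula, NB = TP/N - (FP/N) x pt/(1 - pt), where pt is the risk threshold for referral to screening. The abstract does not state the threshold range or the scale on which the reported NB values are expressed, so the averaging function below, its thresholds, and all names are illustrative assumptions rather than the authors' procedure.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Standard decision-curve net benefit at one risk threshold.

    Individuals with predicted risk >= threshold are treated as screen-eligible.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0:
        raise ValueError("empty input")
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def average_net_benefit(
    y_true: Sequence[int], y_prob: Sequence[float], thresholds: Sequence[float]
) -> float:
    """Mean net benefit over a set of thresholds (the threshold grid is an assumption)."""
    if not thresholds:
        raise ValueError("need at least one threshold")
    return sum(net_benefit(y_true, y_prob, t) for t in thresholds) / len(thresholds)
```

Comparing two models then amounts to evaluating `average_net_benefit` for each on the same participants and thresholds; a model is preferable over that range when its value is higher.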

Interpretation: OWL, with a high degree of predictive accuracy and robustness, is a general framework with scientific justification and clinical utility that can aid in screening individuals at high risk of lung cancer.

Funding: National Natural Science Foundation of China, the US NIH.

Keywords: External validation; Lung cancer; Machine learning; Risk prediction; UK Biobank.

MeSH terms

Biological Specimen Banks
Early Detection of Cancer / methods
England
Humans
Lung
Lung Neoplasms* / diagnostic imaging
Lung Neoplasms* / epidemiology
Male
Mass Screening / methods
Risk Assessment / methods
Smoking
United States
