Multidimensional Cell-Free DNA Fragmentomic Assay for Detection of Early-Stage Lung Cancer

Am J Respir Crit Care Med. 2023 May 1;207(9):1203-1213. doi: 10.1164/rccm.202109-2019OC.

Abstract

Rationale: Cell-free DNA (cfDNA) analysis holds promise for the early detection of lung cancer, which could improve patient survival. However, the sensitivity of previous cfDNA-based approaches has been too low for clinical use, especially for early-stage tumors.

Objectives: To establish an accurate and affordable approach for early-stage lung cancer detection by integrating cfDNA fragmentomics and machine learning models.

Methods: This study included 350 participants without cancer and 432 participants with cancer. Plasma cfDNA samples from the participants were profiled by whole-genome sequencing. Multiple cfDNA features and machine learning algorithms were compared in the training cohort to select an optimal model. Model performance was evaluated in three validation cohorts.

Measurements and Main Results: A stacked ensemble model integrating five cfDNA features and five machine learning algorithms, constructed in the training cohort (cancer: 113; healthy: 113), outperformed all models built on individual feature-algorithm combinations. This integrated model yielded sensitivities of 91.4% at 95.7% specificity in validation cohort I (area under the curve [AUC], 0.984), 84.7% at 98.6% specificity in validation cohort II (AUC, 0.987), and 92.5% at 94.2% specificity in the additional validation cohort (AUC, 0.974). Performance remained consistent when sequencing depth was reduced to 0.5× (AUC, 0.966-0.971). Furthermore, the model was sensitive to early-stage disease (83.2% sensitivity for stage I tumors and 85.0% sensitivity for tumors <1 cm at the 0.66 cutoff).

Conclusions: We established a stacked ensemble model using cfDNA fragmentomic features that achieved high sensitivity for detecting early-stage lung cancer, which could promote early diagnosis and benefit more patients.
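The results above are reported as sensitivity at a given specificity, together with an AUC, for a model that outputs a cancer score per sample. As a minimal sketch of how such figures are computed from scores and labels, the Python below uses made-up illustrative data, not data from the study. The function names, the label coding (1 = cancer, 0 = non-cancer), and the convention that a score at or above the cutoff is called positive are all assumptions; the abstract does not specify them.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a score cutoff.

    Assumption: label 1 = cancer, 0 = non-cancer, and a sample is called
    positive when its score is >= cutoff.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        called_positive = score >= cutoff
        if label == 1:
            if called_positive:
                tp += 1
            else:
                fn += 1
        else:
            if called_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cancer and one non-cancer sample")
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the Mann-Whitney formulation.

    This is the fraction of (cancer, non-cancer) pairs in which the cancer
    sample has the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one cancer and one non-cancer sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four cancer and four non-cancer samples.
example_scores = [0.95, 0.80, 0.70, 0.40, 0.60, 0.30, 0.20, 0.10]
example_labels = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(example_scores, example_labels, cutoff=0.66)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"AUC={auc(example_scores, example_labels):.4f}")
```

On this toy data, three of the four cancer samples score at or above 0.66 and no non-cancer sample does, so the sketch reports 75% sensitivity at 100% specificity. A cutoff would normally be fixed on the training cohort and then applied unchanged to the validation cohorts.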

Keywords: cell-free DNA; early detection; lung cancer; machine learning; whole-genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics
  • Cell-Free Nucleic Acids* / genetics
  • Humans
  • Lung
  • Lung Neoplasms* / diagnosis
  • Whole Genome Sequencing

Substances

  • Cell-Free Nucleic Acids
  • Biomarkers, Tumor