Machine Learning-Assisted Evaluation of Circulating DNA Quantitative Analysis for Cancer Screening

Adv Sci (Weinh). 2020 Jul 29;7(18):2000486. doi: 10.1002/advs.202000486. eCollection 2020 Sep.

Abstract

While the utility of circulating cell-free DNA (cfDNA) in cancer screening and early detection has recently been investigated by testing genetic and epigenetic alterations, here an original approach based on examining cfDNA quantitative and structural features is developed. First, the potential of cfDNA quantitative and structural parameters is independently demonstrated in cell culture, murine, and human plasma models. Subsequently, these variables are evaluated in a large retrospective cohort of 289 healthy individuals and 983 patients with various cancer types; after age resampling, the variables are evaluated independently and then combined using a machine learning approach. Implementation of a decision tree prediction model for the detection and classification of healthy individuals and cancer patients shows unprecedented performance for stage 0, I, and II colorectal cancer (specificity, 0.89; sensitivity, 0.72). Consequently, the methodological proof of concept of using both quantitative and structural biomarkers, together with classification by a machine learning method, is highlighted as an efficient strategy for cancer screening. It is foreseen that the classification rate may be further improved by combining such biomarkers with fragmentomics, methylation, or the detection of genetic alterations. The optimization of such a multianalyte strategy with this machine learning method is therefore warranted.
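
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from a binary classifier's predictions. The function name `sensitivity_specificity` and the toy labels are illustrative assumptions; they are not the study's code or data, and the numbers do not reproduce the reported 0.72 and 0.89.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Convention assumed here: 1 = cancer patient, 0 = healthy individual.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancers correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancers missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of true cancers detected
    specificity = tn / (tn + fp)  # fraction of healthy individuals cleared
    return sensitivity, specificity


# Toy example (hypothetical labels, not study data):
# 4 cancer patients followed by 5 healthy individuals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a screening setting, specificity limits the number of healthy individuals sent for unnecessary follow-up, while sensitivity determines how many cancers are caught, which is why the abstract reports the two values together.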

Keywords: cancer; circulating DNA; early diagnosis; machine learning; screening.