An enrichment model using regular health examination data for early detection of colorectal cancer

Chin J Cancer Res. 2019 Aug;31(4):686-698. doi: 10.21147/j.issn.1000-9604.2019.04.12.

Abstract

Objective: Challenges remain in current practices of colorectal cancer (CRC) screening, such as low compliance, low specificity and high cost. This study aimed to identify high-risk groups for CRC from the general population using regular health examination data.

Methods: The study population consisted of more than 7,000 CRC cases and more than 140,000 controls. Using regular health examination data, a model for detecting CRC cases was derived with the classification and regression trees (CART) algorithm. Receiver operating characteristic (ROC) curve analysis was applied to evaluate model performance. The robustness and generalizability of the CART model were validated in independent datasets. In addition, the effectiveness of CART-based screening was compared with that of stool-based screening.
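
The ROC-based evaluation described above can be illustrated with a minimal sketch. This is not the authors' code: the scores below are made up, and the helper names `auc` and `sensitivity_at_specificity` are hypothetical. It computes the AUC as the fraction of case-control pairs in which the case scores higher, then reads off the sensitivity at the lowest threshold that keeps at least a chosen fraction of controls negative.

```python
import math


def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_at_specificity(case_scores, control_scores, specificity):
    """Sensitivity at the lowest threshold whose specificity meets the target.

    A subject is called positive when its score is strictly above the threshold.
    """
    ranked = sorted(control_scores)
    # Number of controls that must score at or below the threshold.
    # The small epsilon guards against floating-point noise in the product.
    n_negative = math.ceil(specificity * len(ranked) - 1e-9)
    threshold = ranked[n_negative - 1] if n_negative > 0 else float("-inf")
    positives = sum(1 for c in case_scores if c > threshold)
    return positives / len(case_scores)


# Made-up risk scores for illustration only (not study data).
controls = [0.1, 0.2, 0.3, 0.4, 0.9]
cases = [0.35, 0.5, 0.8, 0.95]

example_auc = auc(cases, controls)
example_sensitivity = sensitivity_at_specificity(cases, controls, 0.8)
print(example_auc, example_sensitivity)
```

The pairwise AUC loop is quadratic in the number of subjects, which is fine for a toy example; a rank-based formula would be a more sensible choice for datasets of the size reported here.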

Results: After data quality control, 4,647 CRC cases and 133,898 controls free of colorectal neoplasms were used for downstream analysis. The final CART model was built on four biomarkers (age, albumin, hematocrit and percent lymphocytes). In the test set, the area under the ROC curve (AUC) of the CART model was 0.88 [95% confidence interval (95% CI), 0.87-0.90] for detecting CRC. At the cutoff yielding 99.0% specificity, the model's sensitivity was 62.2% (95% CI, 58.1%-66.2%), thereby achieving a 63-fold enrichment of CRC cases. We validated the robustness of the method across subsets of the test set with diverse CRC incidences, aging rates, gender ratios, distributions of tumor stages and locations, and data sources. Importantly, CART-based screening had a higher positive predictive value (1.6%) than the fecal immunochemical test (0.3%).
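
As a rough check of the enrichment arithmetic, the sketch below computes the positive likelihood ratio, sensitivity / (1 - specificity), at the reported operating point. Treating this ratio as the enrichment factor is an assumption: the abstract does not state how the 63-fold figure was calculated, and rounding of the reported sensitivity and specificity may account for the small difference. The function name is hypothetical.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """How many times more often a case tests positive than a control does."""
    return sensitivity / (1.0 - specificity)


# Operating point reported in the Results: 62.2% sensitivity at 99.0% specificity.
ratio = positive_likelihood_ratio(0.622, 0.990)
print(round(ratio, 1))  # about 62.2
```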

Conclusions: As an alternative approach for the early detection of CRC, this study provides a low-cost method that uses regular health examination data to identify individuals at high risk of CRC for further examination. The approach can promote early detection of CRC, especially in developing countries such as China, where annual health examinations are common but regular CRC-specific screening is rare.

Keywords: Classification and regression trees; colorectal cancer; regular health examination data; routine lab test biomarkers.