Prediction Model for Pancreatic Cancer: A Population-Based Study from NHIRD

Cancers (Basel). 2022 Feb 10;14(4):882. doi: 10.3390/cancers14040882.

Abstract

(1) Background: Cancer has been the leading cause of death in Taiwan for 39 years, and pancreatic cancer has ranked seventh among the ten cancers with the highest mortality for the past three years. Although its incidence ranks at the bottom of the top 10 cancers, its survival rate is very low. Pancreatic cancer is among the more difficult cancers to detect early because early diagnostic tools are lacking, yet early screening is important for its treatment. Only a few studies have designed predictive models for pancreatic cancer.
(2) Methods: This study used the Taiwan National Health Insurance Research Database (NHIRD), which covers over 99% of the population in Taiwan. The subset sample was not significantly different from the original NHIRD sample. A machine learning approach was used to develop a predictive model for pancreatic cancer. Four models were compared: logistic regression (LR), a deep neural network (DNN), a stacking ensemble, and a voting ensemble. The receiver operating characteristic (ROC) curve and a confusion matrix were used to evaluate the accuracy of the pancreatic cancer prediction models.
(3) Results: In the external testing set, the area under the ROC curve (AUC) of the LR model was higher than that of the other three models for all three factor combinations. Sensitivity was highest for the stacking model with the first factor combination, and specificity was highest for the DNN model with the second factor combination. The model that used only nine factors (the third factor combination) performed on par with the other two factor combinations. Previous models for the early assessment of pancreatic cancer had AUCs ranging from approximately 0.57 to 0.71. The AUCs in this study ranged from 0.71 to 0.76, higher than those of previous studies, indicating higher accuracy.
(4) Conclusions: This study compared the performances of LR, DNN, stacking, and voting models for pancreatic cancer prediction and constructed a pancreatic cancer prediction model with accuracy higher than that of previous studies. This predictive model will improve awareness of the risk of pancreatic cancer and give patients with pancreatic cancer a simpler tool for early screening in the golden period when the disease can still be eradicated.
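The evaluation measures named in the abstract (AUC, sensitivity, and specificity from a confusion matrix) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions made for illustration, and no NHIRD data are involved.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_score: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) after thresholding predicted risk scores.

    The 0.5 default threshold is an assumption, not a value from the study.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen non-case (ties count as one half)."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = pancreatic cancer case, 0 = non-case.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]

if __name__ == "__main__":
    counts = confusion_counts(toy_labels, toy_scores)
    sens, spec = sensitivity_specificity(*counts)
    print("AUC:", roc_auc(toy_labels, toy_scores))   # AUC: 0.9375
    print("Sensitivity:", sens, "Specificity:", spec)  # 0.75 and 0.75
```

Unlike sensitivity and specificity, the AUC does not depend on a decision threshold, so reporting all three shows both how well a model ranks patients overall and how it behaves at one operating point.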

Keywords: early screening; pancreatic prevention; personal health; precision health; prediction model.