Development of a prediction model for pancreatic cancer in patients with type 2 diabetes using logistic regression and artificial neural network models

Cancer Manag Res. 2018 Nov 26:10:6317-6324. doi: 10.2147/CMAR.S180791. eCollection 2018.

Abstract

Objectives: Patients with type 2 diabetes (T2DM) are suggested to have a higher risk of developing pancreatic cancer. We used two models to predict pancreatic cancer risk among patients with T2DM.

Methods: The original data used for this investigation were retrieved from the National Health Insurance Research Database of Taiwan. The prediction models included the available possible risk factors for pancreatic cancer. The data were split into training and test sets: 97.5% of the data were used as the training set and 2.5% of the data were used as the test set. Logistic regression (LR) and artificial neural network (ANN) models were implemented using Python (Version 3.7.0). The F 1, precision, and recall were compared between the LR and the ANN models. The areas under the receiver operating characteristic (ROC) curves of the prediction models were also compared.

Results: The metrics used in this study indicated that the LR model more accurately predicted pancreatic cancer than the ANN model. For the LR model, the area under the ROC curve in the prediction of pancreatic cancer was 0.727, indicating a good fit.

Conclusion: Using this LR model, our results suggested that we could appropriately predict pancreatic cancer risk in patients with T2DM in Taiwan.

Keywords: artificial neural network; logistic regression; pancreatic cancer; type 2 diabetes.