A novel prediction model of the risk of pancreatic cancer among diabetes patients using multiple clinical data and machine learning

Cancer Med. 2023 Oct;12(19):19987-19999. doi: 10.1002/cam4.6547. Epub 2023 Sep 22.

Abstract

Introduction: Pancreatic cancer is associated with poor prognosis. Given the rising global incidence of diabetes and the fact that individuals with diabetes are considered a high-risk subpopulation for pancreatic cancer, it is critical to assess the risk of pancreatic cancer among people living with diabetes. This study aimed to develop a novel prediction model for pancreatic cancer risk among patients with diabetes, using a real-world database containing clinical features and employing multiple artificial intelligence algorithms.

Methods: This retrospective observational study analyzed data on patients with Type 2 diabetes from a multisite Taiwanese EMR database between 2009 and 2019. Predictors were selected in accordance with the literature review and clinical perspectives. The prediction models were constructed using machine learning algorithms such as logistic regression, linear discriminant analysis, gradient boosting machine, and random forest.

Results: The cohort consisted of 66,384 patients. The linear discriminant analysis (LDA) model achieved the highest AUROC, 0.9073, followed by the voting ensemble and gradient boosting machine models. LDA, the best model, exhibited an accuracy of 84.03%, a sensitivity of 0.8611, and a specificity of 0.8403. The most significant predictors identified for pancreatic cancer risk were glucose, glycated hemoglobin, hyperlipidemia comorbidity, antidiabetic drug use, and lipid-modifying drug use.
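
For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUROC are commonly computed for a binary classifier. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarize(y_true, y_pred):
    """Accuracy, sensitivity (recall on cases), specificity (recall on non-cases)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auroc(y_true, scores):
    """AUROC as the probability that a random case outscores a random
    non-case, counting ties as one half (pairwise formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 2 cases among 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = summarize(y_true, y_pred)
area = auroc(y_true, scores)
print(metrics, area)
```

Because non-cases usually far outnumber cases in a screening cohort, accuracy in such settings tends to track specificity closely, so sensitivity and AUROC are generally the more informative figures for a rare outcome.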

Conclusion: This study successfully developed a highly accurate 4-year risk model for pancreatic cancer in patients with diabetes using real-world clinical data and multiple machine-learning algorithms. Potentially, our predictors offer an opportunity to identify pancreatic cancer early and thus widen the prevention and intervention windows to improve survival in diabetic patients.

Keywords: Taipei Medical University Clinical Research Database (TMUCRD); artificial intelligence; diabetes; machine learning; pancreatic cancer; prediction model.

Publication types

  • Observational Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Diabetes Mellitus, Type 2* / complications
  • Diabetes Mellitus, Type 2* / epidemiology
  • Humans
  • Machine Learning
  • Pancreas
  • Pancreatic Neoplasms* / epidemiology
  • Pancreatic Neoplasms* / etiology