CRPMKB: a knowledge base of cancer risk prediction models for systematic comparison and personalized applications

Bioinformatics. 2022 Mar 4;38(6):1669-1676. doi: 10.1093/bioinformatics/btab850.

Abstract

Motivation: In the era of big data and precision medicine, accurate risk assessment is a prerequisite for implementing risk screening and preventive treatment. Many studies have focused on cancer risk and constructed related risk prediction models, but these resources have not been effectively integrated for systematic comparison and personalized applications. We therefore established and analyzed the cancer risk prediction model knowledge base (CRPMKB).

Results: The current knowledge base contains data on 802 models. The model comparison indicates that the accuracy of cancer risk prediction was greatly affected by regional differences, cancer types and model types. We divided the model variables into four categories (environment, behavioral lifestyle, biological genetics and clinical examination) and found that the distribution of these variables differs among cancer types. Taking 50 genes involved in the lung cancer risk prediction models as an example, we performed pathway enrichment analyses; the results showed that these genes were significantly enriched in the p53 Signaling and Aryl Hydrocarbon Receptor Signaling pathways, which are associated with cancer and specific diseases. In addition, we verified the biological significance of overlapping lung cancer genes via the STRING database. CRPMKB was established to provide researchers with an online tool for future personalized model application and development. This study of CRPMKB suggests that developing more targeted models based on specific demographic characteristics and cancer types will further improve the accuracy of cancer risk model predictions.
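
As an illustration of what "significantly enriched" can mean for a gene list such as the 50 lung cancer genes, the sketch below computes a one-sided hypergeometric (over-representation) p-value. The abstract does not state which tool or statistic the authors used, so this is only one common approach, not their method. The function name `enrichment_p_value` and every number in the usage example (background size, pathway size, overlap) are hypothetical placeholders, not values from the study.

```python
from math import comb


def enrichment_p_value(background_size, pathway_size, gene_list_size, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    X is the number of pathway genes found in a gene list of size
    `gene_list_size` drawn without replacement from `background_size`
    genes, of which `pathway_size` belong to the pathway.
    """
    max_overlap = min(pathway_size, gene_list_size)
    total = comb(background_size, gene_list_size)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, gene_list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical inputs: 20000 background genes, a 100-gene pathway,
    # a 50-gene query list, and 5 query genes falling in the pathway.
    p = enrichment_p_value(20000, 100, 50, 5)
    print(f"enrichment p-value: {p:.3e}")
```

In practice the p-values for all tested pathways would also be corrected for multiple testing before any pathway is called significant.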

Availability and implementation: CRPMKB is freely available at http://www.sysbio.org.cn/CRPMKB/. The data underlying this article are available in the article and in its online supplementary material.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Big Data
  • Humans
  • Lung Neoplasms*
  • Precision Medicine
  • Risk Assessment