Identification of fibroblast-related genes based on single-cell and machine learning to predict the prognosis and endocrine metabolism of pancreatic cancer

Front Endocrinol (Lausanne). 2023 Jul 31:14:1201755. doi: 10.3389/fendo.2023.1201755. eCollection 2023.

Abstract

Background: Single-cell sequencing technology has become an indispensable tool for studying tumor mechanisms and heterogeneity. Pancreatic adenocarcinoma (PAAD) lacks specific early symptoms, and comprehensive bioinformatics analysis of PAAD can help clarify the mechanisms underlying its development.

Methods: We performed dimensionality reduction analysis on the PAAD single-cell sequencing dataset GSE165399 to obtain specific cell clusters. We then obtained cell cluster-associated gene modules by weighted gene co-expression network analysis (WGCNA) and identified tumorigenesis-associated cell clusters and gene modules in PAAD by trajectory analysis. Tumor-associated genes of PAAD were intersected with cell cluster marker genes and with genes within the signature module to obtain genes associated with PAAD occurrence, which were used to construct a prognostic risk assessment tool with the Cox model. The performance of the model was assessed by Kaplan-Meier (K-M) curves and receiver operating characteristic (ROC) curves. Endocrine pathway scores were assessed by single-sample gene set enrichment analysis (ssGSEA).
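The gene-selection step described in the Methods (intersecting tumor-associated genes with cluster marker genes and module genes) can be illustrated with a minimal sketch. The gene lists and the function name `candidate_genes` are hypothetical placeholders chosen for illustration; they are not the study's data or code.

```python
# Minimal sketch of a three-way gene-set intersection.
# All three gene sets are hypothetical placeholders, not the study's results.
tumor_associated = {"LOX", "SOX4", "APOL1", "KRAS", "TP53"}
cluster_markers = {"LOX", "SOX4", "APOL1", "COL1A1"}
module_genes = {"LOX", "SOX4", "DCN", "APOL1", "RND3"}


def candidate_genes(*gene_sets):
    """Return the sorted list of genes present in every input set."""
    if not gene_sets:
        return []
    common = set(gene_sets[0]).intersection(*gene_sets[1:])
    return sorted(common)


candidates = candidate_genes(tumor_associated, cluster_markers, module_genes)
print(candidates)
```

Only genes present in all three sets survive, which is why this kind of filter tends to yield a short candidate list suitable for a regression model.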

Results: The PAAD single-cell dataset GSE165399 was filtered and subjected to dimensionality reduction, and 17 cell subgroups were identified and labeled as 17 cell clusters. WGCNA revealed that the brown module was most associated with tumorigenesis and was significantly associated with the C11 and C14 cell clusters. The C11 and C14 cell clusters corresponded to fibroblasts and circulating fetal cells, respectively, and trajectory analysis showed low heterogeneity for fibroblasts and extremely high heterogeneity for circulating fetal cells. Differential analysis then showed that genes within the C11 cluster were highly associated with tumorigenesis. Finally, we constructed the RiskScore system; K-M curves and ROC curves indicated that RiskScore had clinical prognostic potential and was robust across multiple datasets. The low-risk group showed higher endocrine metabolism and lower immune infiltration.

Conclusion: We identified a prognostic model consisting of APOL1, BHLHE40, CLMP, GNG12, LOX, LY6E, MYL12B, RND3, and SOX4, and the resulting RiskScore showed promising clinical value. RiskScore may have credible clinical prognostic potential for PAAD.
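As a rough illustration of how a Cox-based score over these nine genes is conventionally computed, the sketch below takes a weighted sum of expression values and splits samples at the median. The coefficients, the expression values, and the median cutoff are all assumptions made for illustration; the abstract does not report the fitted coefficients or the cutoff used.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the abstract does not
# report the fitted Cox coefficients.
COEFS = {
    "APOL1": 0.2, "BHLHE40": 0.1, "CLMP": 0.1,
    "GNG12": 0.3, "LOX": 0.2, "LY6E": -0.1,
    "MYL12B": 0.1, "RND3": 0.2, "SOX4": -0.1,
}


def risk_score(expression, coefs=COEFS):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(scores):
    """Label samples 'high' if strictly above the median score (assumed cutoff), else 'low'."""
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Hypothetical expression profiles: every signature gene set to 0, 1, or 2.
samples = {
    "P1": {gene: 0.0 for gene in COEFS},
    "P2": {gene: 1.0 for gene in COEFS},
    "P3": {gene: 2.0 for gene in COEFS},
}
scores = {name: risk_score(expr) for name, expr in samples.items()}
groups = stratify(scores)
print(groups)
```

In practice the coefficients would come from a fitted Cox regression, and the two groups would then be compared with K-M curves.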

Keywords: RiskScore; endocrine metabolism; pancreatic adenocarcinoma; prognosis; single-cell sequencing; tumorigenesis.

MeSH terms

  • Adenocarcinoma*
  • Apolipoprotein L1
  • Carcinogenesis
  • Cell Transformation, Neoplastic
  • Fibroblasts
  • Humans
  • Machine Learning
  • Pancreatic Neoplasms* / diagnosis
  • Pancreatic Neoplasms* / genetics
  • Prognosis
  • SOXC Transcription Factors

Substances

  • SOX4 protein, human
  • SOXC Transcription Factors
  • APOL1 protein, human
  • Apolipoprotein L1