Pseudogene-gene functional networks are prognostic of patient survival in breast cancer

BMC Med Genomics. 2020 Apr 3;13(Suppl 5):51. doi: 10.1186/s12920-020-0687-0.

Abstract

Background: Given the vast range of molecular mechanisms giving rise to breast cancer, it is unlikely universal cures exist. However, by providing a more precise prognosis for breast cancer patients through integrative models, treatments can become more individualized, resulting in more successful outcomes. Specifically, we combine gene expression, pseudogene expression, miRNA expression, clinical factors, and pseudogene-gene functional networks to generate these models for breast cancer prognostics. Establishing a LASSO-generated molecular gene signature revealed that the increased expression of genes STXBP5, GALP and LOC387646 indicate a poor prognosis for a breast cancer patient. We also found that increased CTSLP8 and RPS10P20 and decreased HLA-K pseudogene expression indicate poor prognosis for a patient. Perhaps most importantly we identified a pseudogene-gene interaction, GPS2-GPS2P1 (improved prognosis) that is prognostic where neither the gene nor pseudogene alone is prognostic of survival. Besides, miR-3923 was predicted to target GPS2 using miRanda, PicTar, and TargetScan, which imply modules of gene-pseudogene-miRNAs that are potentially functionally related to patient survival.

Results: In our LASSO-based model, we take into account features including pseudogenes, genes and candidate pseudogene-gene interactions. Key biomarkers were identified from the features. The identification of key biomarkers in combination with significant clinical factors (such as stage and radiation therapy status) should be considered as well, enabling a specific prognostic prediction and future treatment plan for an individual patient. Here we used our PseudoFuN web application to identify the candidate pseudogene-gene interactions as candidate features in our integrative models. We further identified potential miRNAs targeting those features in our models using PseudoFuN as well. From this study, we present an interpretable survival model based on LASSO and decision trees, we also provide a novel feature set which includes pseudogene-gene interaction terms that have been ignored by previous prognostic models. We find that some interaction terms for pseudogenes and genes are significantly prognostic of survival. These interactions are cross-over interactions, where the impact of the gene expression on survival changes with pseudogene expression and vice versa. These may imply more complicated regulation mechanisms than previously understood.

Conclusions: We recommend these novel feature sets be considered when training other types of prognostic models as well, which may provide more comprehensive insights into personalized treatment decisions.

Keywords: Breast cancer; Cox regression; Data integration; Database; Network analysis; Non-coding RNAs; Pseudogenes; RNA-Seq; Survival prognosis.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Breast Neoplasms / genetics
  • Breast Neoplasms / mortality*
  • Breast Neoplasms / pathology
  • Computational Biology / methods*
  • Female
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic*
  • Gene Regulatory Networks*
  • Humans
  • Prognosis
  • Pseudogenes*
  • Survival Rate

Substances

  • Biomarkers, Tumor