Comprehensive landscape of subtype-specific coding and non-coding RNA transcripts in breast cancer

Oncotarget. 2016 Oct 18;7(42):68851-68863. doi: 10.18632/oncotarget.11998.

Abstract

Molecular classification of breast cancer into clinically relevant subtypes helps improve prognosis and adjuvant-treatment decisions. The aim of this study is to characterize the molecular subtypes more fully by providing a comprehensive landscape of subtype-specific isoforms, including coding, long non-coding RNA and microRNA transcripts. Isoform-level expression of all coding and non-coding RNAs is estimated from RNA-sequencing data of 1168 breast samples obtained from The Cancer Genome Atlas (TCGA) project. We then search the whole transcriptome systematically for subtype-specific isoforms using a novel algorithm based on a robust quasi-Poisson model. We discover 5451 isoforms specific to single subtypes. A total of 27% of the subtype-specific isoforms classify the intrinsic subtypes more accurately than their corresponding genes do. We find three subtype-specific miRNAs and 707 subtype-specific long non-coding RNAs. The isoforms from long non-coding RNAs also separate the Luminal A and Luminal B subtypes well, with an AUC of 0.97 in the discovery set and 0.90 in the validation set. In addition, we discover 1500 isoforms preferentially co-expressed in two subtypes, including 369 isoforms co-expressed in both the Normal-like and Basal subtypes, which are commonly considered to differ in estrogen receptor (ER) status. Finally, analyses at the protein level reveal four subtype-specific proteins and two subtype co-expressed proteins that successfully validate results from the isoform level.
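
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the two generic ingredients it names: a quasi-Poisson comparison of one subtype against all others, and an AUC for a single transcript. It is not the authors' robust method. The function names, the one-vs-rest design, the Pearson dispersion estimate, and the toy counts are all assumptions made for illustration, and the sketch assumes both groups have non-zero mean counts.

```python
import math


def quasi_poisson_one_vs_rest(counts, labels, subtype):
    """Compare `subtype` against all other samples for one isoform.

    For a Poisson log-linear model with a single group indicator, the
    fitted means are the two group means, so no iterative fit is needed.
    Returns (log fold change, dispersion, z statistic).
    """
    inside = [c for c, lab in zip(counts, labels) if lab == subtype]
    outside = [c for c, lab in zip(counts, labels) if lab != subtype]
    mu_in = sum(inside) / len(inside)
    mu_out = sum(outside) / len(outside)

    # Pearson chi-square divided by residual degrees of freedom gives the
    # quasi-Poisson dispersion (1.0 would mean plain Poisson variance).
    pearson = sum((c - mu_in) ** 2 / mu_in for c in inside)
    pearson += sum((c - mu_out) ** 2 / mu_out for c in outside)
    dispersion = pearson / (len(counts) - 2)

    log_fc = math.log(mu_in / mu_out)
    # Poisson variance of the log fold change, inflated by the dispersion.
    se = math.sqrt(dispersion * (1 / sum(inside) + 1 / sum(outside)))
    return log_fc, dispersion, log_fc / se


def auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half (equivalent to the Mann-Whitney U statistic)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical read counts for one isoform across ten samples.
counts = [52, 61, 48, 70, 9, 12, 7, 15, 11, 8]
labels = ["Basal"] * 4 + ["LumA"] * 3 + ["LumB"] * 3

log_fc, dispersion, z = quasi_poisson_one_vs_rest(counts, labels, "Basal")
basal_auc = auc(counts, [lab == "Basal" for lab in labels])
print(f"log FC={log_fc:.2f}  dispersion={dispersion:.2f}  z={z:.2f}  AUC={basal_auc:.2f}")
```

Under these assumptions, running the same two functions on gene-level counts and on isoform-level counts would give one way to compare their classification accuracy, in the spirit of the isoform-versus-gene comparison reported above.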

Keywords: RNA sequencing; breast cancer; non-coding RNAs; subtype co-expression; subtype-specific isoforms.

MeSH terms

  • Algorithms
  • Area Under Curve
  • Breast Neoplasms / genetics*
  • Female
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Prognosis
  • Proteomics
  • RNA, Long Noncoding / genetics*
  • RNA, Neoplasm / genetics*
  • Receptor, Angiotensin, Type 1 / genetics
  • Transcriptome

Substances

  • AGTR1 protein, human
  • RNA, Long Noncoding
  • RNA, Neoplasm
  • Receptor, Angiotensin, Type 1