Transcript-targeted analysis reveals isoform alterations and double-hop fusions in breast cancer

Commun Biol. 2021 Nov 22;4(1):1320. doi: 10.1038/s42003-021-02833-4.

Abstract

Although transcriptome alteration is an essential driver of carcinogenesis, the effects of chromosomal structural alterations on the cancer transcriptome are not yet fully understood. Short-read transcript sequencing has prevented researchers from directly exploring full-length transcripts, forcing them to focus on individual splice sites. Here, we develop a pipeline for Multi-Sample long-read Transcriptome Assembly (MuSTA), which enables construction of a transcriptome from long-read sequence data. Using the constructed transcriptome as a reference, we analyze RNA extracted from 22 clinical breast cancer specimens. We identify a comprehensive set of subtype-specific and differentially used isoforms, which extends our knowledge of isoform regulation to unannotated isoforms, including a short form of TNS3. We also find that the exon-intron structure of fusion transcripts depends on their genomic context, and we identify double-hop fusion transcripts that are transcribed from complex structural rearrangements. For example, a double-hop fusion results in aberrant expression of an endogenous retroviral gene, ERVFRD-1, which is normally expressed exclusively in the placenta and is thought to protect the fetus from maternal rejection; expression is elevated in several TCGA samples with ERVFRD-1 fusions. Our analyses provide direct evidence that full-length transcript sequencing of clinical samples can add to our understanding of cancer biology and genomics in general.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / genetics*
  • Breast Neoplasms / metabolism
  • Gene Fusion*
  • Humans
  • Protein Isoforms / metabolism
  • RNA / analysis
  • Tensins / genetics
  • Tensins / metabolism
  • Transcriptome*

Substances

  • Protein Isoforms
  • TNS3 protein, human
  • Tensins
  • RNA