FusionGDB: fusion gene annotation DataBase

Nucleic Acids Res. 2019 Jan 8;47(D1):D994-D1004. doi: 10.1093/nar/gky1067.

Abstract

Gene fusion, which arises from chromosomal rearrangements initiated by DNA double-strand breaks, is one of the hallmarks of the cancer genome. To date, many fusion genes (FGs) have been established as important biomarkers and therapeutic targets in multiple cancer types. To better understand the function of FGs across cancer types and to promote the discovery of clinically relevant FGs, we built FusionGDB (Fusion Gene annotation DataBase), available at https://ccsm.uth.edu/FusionGDB. We collected 48 117 FGs across pan-cancer from three representative fusion gene resources: the improved database of chimeric transcripts and RNA-seq data (ChiTaRS 3.1), an integrative resource for cancer-associated transcript fusions (TumorFusions), and The Cancer Genome Atlas (TCGA) fusions by Gao et al. For these ∼48K FGs, we performed functional annotations including gene assessment across pan-cancer fusion genes, open reading frame (ORF) assignment, and retention search of 39 protein features based on gene structures of multiple isoforms with different breakpoints. We also provide the fusion transcript and amino acid sequences according to multiple breakpoints and transcript isoforms. Our analyses identified 331, 303 and 667 in-frame FGs that retain kinase, DNA-binding, and epigenetic factor domains, respectively, as well as 976 FGs that lost protein-protein interactions. FusionGDB provides six categories of annotations: FusionGeneSummary, FusionProtFeature, FusionGeneSequence, FusionGenePPI, RelatedDrug and RelatedDisease.
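
The abstract does not describe how ORF assignment is computed, so the following is only a minimal sketch of one common way to decide whether a fusion is in-frame. It assumes both breakpoints fall inside coding sequence and that the number of coding bases on each side of the breakpoint is already known for the chosen transcript isoforms. The function name `classify_frame` and its parameters are hypothetical and are not part of FusionGDB.

```python
def classify_frame(five_prime_cds_bases: int, three_prime_cds_skipped: int) -> str:
    """Classify a fusion as 'in-frame' or 'frame-shift' (illustrative only).

    five_prime_cds_bases: coding bases retained from the 5' partner,
        counted from its start codon up to the breakpoint (assumed known).
    three_prime_cds_skipped: coding bases of the 3' partner that lie
        upstream of its breakpoint and are therefore lost (assumed known).
    """
    if five_prime_cds_bases < 0 or three_prime_cds_skipped < 0:
        raise ValueError("base counts must be non-negative")

    # Codon position at which the 5' partner stops (0, 1 or 2).
    five_prime_phase = five_prime_cds_bases % 3
    # Codon position at which the 3' partner resumes (0, 1 or 2).
    three_prime_phase = three_prime_cds_skipped % 3

    # The 3' partner keeps its original reading frame only when it resumes
    # at the same codon position where the 5' partner left off.
    if five_prime_phase == three_prime_phase:
        return "in-frame"
    return "frame-shift"


if __name__ == "__main__":
    # Hypothetical breakpoint offsets, not taken from any real fusion.
    print(classify_frame(300, 150))
    print(classify_frame(301, 150))
```

Because the result depends on the isoform-specific coding offsets, the same genomic breakpoint pair can be in-frame for one isoform combination and frame-shifted for another, which is consistent with the abstract's emphasis on multiple isoforms and breakpoints. Breakpoints in untranslated regions or introns would need additional handling that this sketch leaves out.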

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Databases, Genetic*
  • Gene Fusion*
  • Molecular Sequence Annotation
  • Mutant Chimeric Proteins / chemistry
  • Mutant Chimeric Proteins / genetics*
  • Mutant Chimeric Proteins / metabolism
  • Neoplasms / genetics*
  • Oncogene Proteins, Fusion / chemistry
  • Oncogene Proteins, Fusion / genetics
  • Open Reading Frames
  • Protein Interaction Mapping
  • User-Computer Interface

Substances

  • Mutant Chimeric Proteins
  • Oncogene Proteins, Fusion