Sensitive gene fusion detection using ambiguously mapping RNA-Seq read pairs

Bioinformatics. 2011 Apr 15;27(8):1068-75. doi: 10.1093/bioinformatics/btr085. Epub 2011 Feb 16.

Abstract

Motivation: Paired-end whole transcriptome sequencing provides evidence for fusion transcripts. However, due to the repetitiveness of the transcriptome, many reads have multiple high-quality mappings. Previous methods to find gene fusions either ignored these reads or required additional longer single reads. These restrictions can obscure up to 30% of fusions and unnecessarily discard much of the data.

Results: We present a method for using paired-end reads to find fusion transcripts without requiring unique mappings or additional single-read sequencing. Using simulated data and data from tumors and cell lines, we show that our method can find fusions supported by ambiguously mapping read pairs without generating numerous spurious fusions from their many mapping locations.
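
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of the general idea of keeping ambiguously mapping read pairs instead of discarding them. It is not the ShortFuse implementation. The function name, the input layout, the gene names, and the equal-share weighting rule are all assumptions made for this example: each read pair spreads one unit of evidence evenly over the gene pairs it could support, and a pair is skipped if any of its mappings places both mates in the same gene.

```python
from collections import defaultdict
from itertools import product


def candidate_fusion_weights(read_pairs):
    """Accumulate fractional fusion evidence from read pairs (hypothetical helper).

    read_pairs: iterable of (mate1_genes, mate2_genes), where each element is
    the collection of genes that mate maps to equally well.
    Returns a dict mapping an alphabetically sorted gene pair to its evidence.
    """
    weights = defaultdict(float)
    for mate1_genes, mate2_genes in read_pairs:
        combos = set(product(set(mate1_genes), set(mate2_genes)))
        # Assumption: if both mates can be placed in the same gene, the pair
        # is explained without a fusion, so it contributes no evidence.
        if any(g1 == g2 for g1, g2 in combos):
            continue
        # Treat (A, B) and (B, A) as the same candidate fusion.
        gene_pairs = {tuple(sorted(combo)) for combo in combos}
        # Assumption: one unit of evidence, shared equally among candidates.
        share = 1.0 / len(gene_pairs)
        for gene_pair in gene_pairs:
            weights[gene_pair] += share
    return dict(weights)


# Hypothetical gene names, for illustration only.
example_pairs = [
    (["GENE_A"], ["GENE_B"]),            # uniquely mapping discordant pair
    (["GENE_A"], ["GENE_B", "GENE_C"]),  # second mate maps ambiguously
    (["GENE_A", "GENE_B"], ["GENE_A"]),  # has a same-gene explanation
]
print(candidate_fusion_weights(example_pairs))
```

In this toy input the ambiguous pair adds half a unit to each of its two candidates rather than being dropped, while the pair with a same-gene explanation adds nothing. A real method would need a more principled way to resolve the ambiguity than equal shares.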

Availability: A C++ and Python implementation of the method demonstrated in this article is available at http://exon.ucsd.edu/ShortFuse.

Contact: mckinsel@ucsd.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line
  • Gene Expression Profiling
  • Gene Fusion*
  • Humans
  • Male
  • Prostatic Neoplasms / genetics
  • RNA, Messenger / chemistry
  • Sequence Analysis, RNA / methods*

Substances

  • RNA, Messenger