Two-pass alignment improves novel splice junction quantification

Bioinformatics. 2016 Jan 1;32(1):43-9. doi: 10.1093/bioinformatics/btv642. Epub 2015 Oct 30.

Abstract

Motivation: Discovery of novel splicing from RNA sequence data remains a critical and exciting focus of transcriptomics, but reduced alignment power impedes expression quantification of novel splice junctions.

Results: Here, we profile performance characteristics of two-pass alignment, which separates splice junction discovery from quantification. Per sample, across a variety of transcriptome sequencing datasets, two-pass alignment improved quantification of at least 94% of simulated novel splice junctions, and provided as much as 1.7-fold deeper median read depth over those splice junctions. We further demonstrate that two-pass alignment works by increasing the alignment of reads that span splice junctions by only short lengths, and that potential alignment errors are readily identifiable by simple classification. Taken together, two-pass alignment promises to advance quantification and discovery of novel splicing events.
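
As an illustration of the two summary figures quoted above, the following minimal sketch computes the fraction of junctions with higher read depth after the second pass and the ratio of median depths. The function name, the junction identifiers, and the counts are hypothetical toy values, not data from the study, and taking the ratio of medians (rather than the median of per-junction ratios) is an assumption about how the fold change is defined.

```python
from statistics import median


def summarize_two_pass(depth_one_pass, depth_two_pass):
    """Compare per-junction read depths between one-pass and two-pass runs.

    Both arguments map a junction identifier to its spliced-read count.
    Returns a tuple: (fraction of shared junctions whose depth is higher
    after two-pass alignment, median two-pass depth / median one-pass depth).
    """
    junctions = sorted(set(depth_one_pass) & set(depth_two_pass))
    if not junctions:
        raise ValueError("no junctions shared between the two runs")

    improved = sum(depth_two_pass[j] > depth_one_pass[j] for j in junctions)
    median_one = median(depth_one_pass[j] for j in junctions)
    median_two = median(depth_two_pass[j] for j in junctions)
    fold = median_two / median_one if median_one else float("inf")
    return improved / len(junctions), fold


# Toy counts for illustration only (hypothetical, not from the paper).
one_pass = {"chr1:100-200": 10, "chr1:300-400": 4, "chr2:50-90": 8, "chr3:10-70": 6}
two_pass = {"chr1:100-200": 15, "chr1:300-400": 8, "chr2:50-90": 8, "chr3:10-70": 12}

fraction_improved, median_fold = summarize_two_pass(one_pass, two_pass)
print(f"improved: {fraction_improved:.0%}, median fold change: {median_fold:.2f}")
```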

Contact: arul@med.umich.edu, nesvi@med.umich.edu

Availability and implementation: Two-pass alignment was implemented here as sequential alignment, genome indexing, and re-alignment steps with STAR. Full parameters are provided in Supplementary Table 2.
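
The three steps named above (alignment, genome indexing, re-alignment) can be sketched as a small helper that assembles the corresponding STAR command lines without running them. This is a hedged outline, not the authors' configuration: the function name, file paths, and thread count are hypothetical, only a minimal set of STAR options is shown, and the parameters actually used in the study are those in Supplementary Table 2. It assumes a first-pass genome index already exists and sets `--sjdbOverhang` to read length minus one, the value the STAR manual suggests.

```python
import shlex


def star_two_pass_commands(first_index, genome_fasta, reads, workdir,
                           read_length, threads=4):
    """Build (but do not run) STAR commands for a manual two-pass alignment.

    first_index  : directory of an existing STAR genome index (assumed)
    genome_fasta : reference FASTA used to build the second-pass index
    reads        : list of FASTQ paths (one file, or two for paired-end)
    workdir      : output directory (hypothetical layout)
    """
    pass1_prefix = f"{workdir}/pass1_"
    pass2_index = f"{workdir}/index_pass2"
    pass2_prefix = f"{workdir}/pass2_"
    common = ["STAR", "--runThreadN", str(threads)]

    # Step 1: first alignment; STAR writes junctions to <prefix>SJ.out.tab.
    align1 = common + ["--genomeDir", first_index,
                       "--readFilesIn", *reads,
                       "--outFileNamePrefix", pass1_prefix]

    # Step 2: rebuild the index with the junctions discovered in step 1.
    reindex = common + ["--runMode", "genomeGenerate",
                        "--genomeDir", pass2_index,
                        "--genomeFastaFiles", genome_fasta,
                        "--sjdbFileChrStartEnd", pass1_prefix + "SJ.out.tab",
                        "--sjdbOverhang", str(read_length - 1)]

    # Step 3: re-align the same reads against the junction-aware index.
    align2 = common + ["--genomeDir", pass2_index,
                       "--readFilesIn", *reads,
                       "--outFileNamePrefix", pass2_prefix]

    return [align1, reindex, align2]


commands = star_two_pass_commands(
    first_index="ref/index_pass1",      # hypothetical paths
    genome_fasta="ref/genome.fa",
    reads=["sample_R1.fastq", "sample_R2.fastq"],
    workdir="out",
    read_length=101,
)
for cmd in commands:
    print(shlex.join(cmd))
```

Returning argument lists rather than shell strings keeps the sketch safe to pass to `subprocess.run` once the paths are real. The directory named by `--genomeDir` may need to be created before the indexing step, depending on the STAR version.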

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Sequence
  • Cell Line, Tumor
  • Databases, Nucleic Acid
  • Humans
  • RNA Splice Sites / genetics*
  • RNA Splicing / genetics*
  • Sequence Alignment / methods*

Substances

  • RNA Splice Sites