Motivation: Discovery of novel splicing from RNA sequence data remains a critical and exciting focus of transcriptomics, but reduced alignment power impedes expression quantification of novel splice junctions.
Results: Here, we profile performance characteristics of two-pass alignment, which separates splice junction discovery from quantification. Per sample, across a variety of transcriptome sequencing datasets, two-pass alignment improved quantification of at least 94% of simulated novel splice junctions, and provided as much as 1.7-fold deeper median read depth over those splice junctions. We further demonstrate that two-pass alignment works by allowing reads that span a splice junction by only a short length to align to it, and that potential alignment errors are readily identifiable by simple classification. Taken together, two-pass alignment promises to advance quantification and discovery of novel splicing events.
Contact: arul@med.umich.edu, nesvi@med.umich.edu
Availability and implementation: Two-pass alignment was implemented here as sequential alignment, genome indexing, and re-alignment steps with STAR. Full parameters are provided in Supplementary Table 2.
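The three steps named above (alignment, genome indexing, re-alignment) can be sketched as a sequence of STAR invocations. The following is a minimal illustration only, not the parameter set used in this work, which is given in Supplementary Table 2. The function name, directory layout, thread count and the `--sjdbOverhang` value are assumptions made for the example; the flags themselves are standard STAR options, and `SJ.out.tab` is the splice junction table STAR writes after an alignment run.

```python
def two_pass_commands(genome_dir, genome_fasta, reads, out_dir,
                      overhang=100, threads=4):
    """Build the three STAR command lines for a two-pass alignment.

    Hypothetical helper: it only constructs argument lists and does not
    run STAR. `overhang` is an assumed value; it is commonly chosen as
    read length minus one.
    """
    pass1_prefix = f"{out_dir}/pass1/"
    reindexed_dir = f"{out_dir}/reindexed"
    pass2_prefix = f"{out_dir}/pass2/"

    # Step 1: first-pass alignment, used here only to discover junctions.
    pass1 = [
        "STAR", "--runMode", "alignReads",
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--outFileNamePrefix", pass1_prefix,
        "--runThreadN", str(threads),
    ]

    # Step 2: rebuild the genome index, inserting the junctions that the
    # first pass reported in SJ.out.tab.
    reindex = [
        "STAR", "--runMode", "genomeGenerate",
        "--genomeDir", reindexed_dir,
        "--genomeFastaFiles", genome_fasta,
        "--sjdbFileChrStartEnd", pass1_prefix + "SJ.out.tab",
        "--sjdbOverhang", str(overhang),
        "--runThreadN", str(threads),
    ]

    # Step 3: re-align the same reads against the junction-aware index;
    # quantification is taken from this second pass.
    pass2 = [
        "STAR", "--runMode", "alignReads",
        "--genomeDir", reindexed_dir,
        "--readFilesIn", *reads,
        "--outFileNamePrefix", pass2_prefix,
        "--runThreadN", str(threads),
    ]

    return [pass1, reindex, pass2]
```

Each returned list could be passed in order to `subprocess.run(cmd, check=True)`, after creating the output directories, which STAR may not create by itself. Keeping command construction separate from execution makes the sequence easy to inspect before any compute time is spent.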
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.