Two-pass alignment improves novel splice junction quantification

Brendan A Veeneman; Sudhanshu Shukla; Saravana M Dhanasekaran; Arul M Chinnaiyan; Alexey I Nesvizhskii

doi:10.1093/bioinformatics/btv642

Bioinformatics. 2016 Jan 1;32(1):43-9. doi: 10.1093/bioinformatics/btv642. Epub 2015 Oct 30.

Authors

Brendan A Veeneman¹, Sudhanshu Shukla², Saravana M Dhanasekaran², Arul M Chinnaiyan³, Alexey I Nesvizhskii⁴

Affiliations

¹ Department of Computational Medicine and Bioinformatics, Michigan Center for Translational Pathology.
² Michigan Center for Translational Pathology, Department of Pathology.
³ Department of Computational Medicine and Bioinformatics, Michigan Center for Translational Pathology, Department of Pathology, Department of Urology and Howard Hughes Medical Institute, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA.
⁴ Department of Computational Medicine and Bioinformatics, Michigan Center for Translational Pathology, Department of Pathology.

Abstract

Motivation: Discovery of novel splicing from RNA sequence data remains a critical and exciting focus of transcriptomics, but reduced alignment power impedes expression quantification of novel splice junctions.

Results: Here, we profile performance characteristics of two-pass alignment, which separates splice junction discovery from quantification. Per sample, across a variety of transcriptome sequencing datasets, two-pass alignment improved quantification of at least 94% of simulated novel splice junctions, and provided as much as 1.7-fold deeper median read depth over those splice junctions. We further demonstrate that two-pass alignment works by allowing reads that overlap splice junctions by only short lengths to align, and that potential alignment errors are readily identifiable by simple classification. Taken together, two-pass alignment promises to advance quantification and discovery of novel splicing events.
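To make the two summary statistics above concrete, the following is a minimal sketch of how they could be computed from per-junction read counts in a single sample. It is not the authors' analysis code. The function name `summarize_two_pass`, the junction keys, and the toy counts are hypothetical, and the fold change is assumed here to be the ratio of median depths (the abstract does not say whether a ratio of medians or a median of ratios was used).

```python
from statistics import median


def summarize_two_pass(first_pass, second_pass):
    """Compare per-junction read counts from one-pass and two-pass alignment.

    Both arguments map a junction key (hypothetical format "chr:start-end")
    to the number of reads spanning that junction. A junction missing from
    one mapping is treated as having zero reads there.
    """
    junctions = sorted(set(first_pass) | set(second_pass))
    if not junctions:
        raise ValueError("no junctions to compare")

    depth_1 = [first_pass.get(j, 0) for j in junctions]
    depth_2 = [second_pass.get(j, 0) for j in junctions]

    # Fraction of junctions with strictly more reads after the second pass.
    improved = sum(b > a for a, b in zip(depth_1, depth_2))

    # Assumption: fold change is the ratio of median depths.
    median_1 = median(depth_1)
    fold = median(depth_2) / median_1 if median_1 > 0 else float("inf")

    return {
        "fraction_improved": improved / len(junctions),
        "median_fold_change": fold,
    }


# Toy counts for illustration only; these are not values from the paper.
one_pass = {"chr1:100-200": 4, "chr1:300-400": 10, "chr2:50-90": 6}
two_pass = {"chr1:100-200": 8, "chr1:300-400": 10, "chr2:50-90": 9}
summary = summarize_two_pass(one_pass, two_pass)
```

With these toy counts, two of the three junctions gain reads and the median depth rises from 6 to 9, a 1.5-fold change.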

Contact: arul@med.umich.edu, nesvi@med.umich.edu

Availability and implementation: Two-pass alignment was implemented here as sequential alignment, genome indexing, and re-alignment steps with STAR. Full parameters are provided in Supplementary Table 2.
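The three steps named above (alignment, genome indexing with the discovered junctions, re-alignment) can be sketched as STAR command lines assembled in Python. This is an illustration under stated assumptions, not the authors' pipeline: the parameters actually used are those in Supplementary Table 2 and are not reproduced here. The function name `two_pass_commands`, all file paths, the thread count, and the choice of `--sjdbOverhang` as read length minus one (the value suggested in the STAR manual) are assumptions. The sketch also assumes a junction-free base index already exists for the first pass.

```python
from pathlib import Path


def two_pass_commands(genome_fasta, reads, base_index, workdir,
                      read_length=100, threads=4):
    """Build (but do not run) STAR commands for a manual two-pass alignment.

    All paths and defaults are hypothetical placeholders.
    """
    work = Path(workdir)
    pass1_prefix = str(work / "pass1_")
    pass2_prefix = str(work / "pass2_")
    pass2_index = str(work / "index_pass2")
    # STAR writes discovered junctions to <prefix>SJ.out.tab.
    junctions = pass1_prefix + "SJ.out.tab"

    # Step 1: first-pass alignment, used only for junction discovery.
    align_1 = ["STAR", "--runMode", "alignReads",
               "--genomeDir", str(base_index),
               "--readFilesIn", *reads,
               "--outFileNamePrefix", pass1_prefix,
               "--runThreadN", str(threads)]

    # Step 2: rebuild the genome index with first-pass junctions inserted.
    index_2 = ["STAR", "--runMode", "genomeGenerate",
               "--genomeDir", pass2_index,
               "--genomeFastaFiles", str(genome_fasta),
               "--sjdbFileChrStartEnd", junctions,
               "--sjdbOverhang", str(read_length - 1),
               "--runThreadN", str(threads)]

    # Step 3: re-align the same reads against the junction-aware index.
    align_2 = ["STAR", "--runMode", "alignReads",
               "--genomeDir", pass2_index,
               "--readFilesIn", *reads,
               "--outFileNamePrefix", pass2_prefix,
               "--runThreadN", str(threads)]

    return [align_1, index_2, align_2]


commands = two_pass_commands("genome.fa", ["r1.fq", "r2.fq"],
                             "index_base", "out", read_length=101)
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order, once the output directories exist and STAR is installed. Building the commands separately from running them keeps the sketch inspectable without requiring the aligner.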

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Base Sequence
Cell Line, Tumor
Databases, Nucleic Acid
Humans
RNA Splice Sites / genetics*
RNA Splicing / genetics*
Sequence Alignment / methods*

Substances

RNA Splice Sites
