RNA variant identification discrepancy among splice-aware alignment algorithms

PLoS One. 2018 Aug 2;13(8):e0201822. doi: 10.1371/journal.pone.0201822. eCollection 2018.

Abstract

Next-generation sequencing (NGS) techniques have generated a variety of molecular maps, including transcriptomes via RNA-seq. Although the primary purpose of RNA-seq is to quantify the expression levels of known genes, RNA variants can also be identified. However, care must be taken to account for the dynamic nature of RNA. In this study, we evaluated the following popular splice-aware alignment algorithms in the context of RNA variant-calling analysis: HISAT2, STAR, STAR (two-pass mode), Subread, and Subjunc. To this end, we performed RNA-seq on ten samples of invasive ductal carcinoma from breast tissue and three samples of adjacent normal tissue from a single patient, and used these data to evaluate the performance of the splice-aware aligners. Surprisingly, the potential RNA editing sites (pRESs) identified in common by all alignment algorithms accounted for less than 2% of the total. The main cause of this discrepancy was reads mapped to splice junctions. In addition, RNA quality significantly affected the outcome. Therefore, researchers must consider these experimental and bioinformatic factors during RNA variant analysis. Further investigation of the common pRESs revealed that the BDH1, CCDC137, and TBC1D10A transcripts contained a single non-synonymous RNA variant that was unique to breast cancer tissue compared with adjacent normal tissue; thus, further clinical validation is required.
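
The concordance figure quoted in the abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not say so, that "the total" means the union of sites called by any aligner, and that a site is keyed by chromosome, position, reference base, and alternate base. The function name `common_site_fraction` and the toy calls are hypothetical and are not data from the study.

```python
from typing import Dict, Set, Tuple

# A candidate site keyed by (chromosome, position, reference base, alternate base).
Site = Tuple[str, int, str, str]


def common_site_fraction(sites_by_aligner: Dict[str, Set[Site]]) -> float:
    """Return the share of sites called by every aligner, relative to the
    union of sites called by at least one aligner (an assumed denominator)."""
    if not sites_by_aligner:
        return 0.0
    site_sets = list(sites_by_aligner.values())
    common = set.intersection(*site_sets)  # called by all aligners
    total = set.union(*site_sets)          # called by any aligner
    return len(common) / len(total) if total else 0.0


# Hypothetical toy calls, for illustration only.
toy_calls = {
    "HISAT2": {("chr1", 100, "A", "G"), ("chr1", 250, "A", "G"), ("chr2", 75, "C", "T")},
    "STAR": {("chr1", 100, "A", "G"), ("chr2", 75, "C", "T"), ("chr3", 10, "A", "G")},
    "Subread": {("chr1", 100, "A", "G"), ("chr4", 5, "A", "G")},
}

fraction = common_site_fraction(toy_calls)
print(f"Common sites: {fraction:.0%} of the union")
```

In this toy input, one of five distinct sites is shared by all three aligners. Keying sites on the alternate base as well as the position means that two aligners reporting different substitutions at the same coordinate are counted as disagreeing, which is a stricter choice than matching on position alone.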

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Breast / metabolism
  • Breast / pathology
  • Carcinoma, Ductal, Breast / genetics
  • Carcinoma, Ductal, Breast / metabolism
  • Carcinoma, Ductal, Breast / pathology
  • Computational Biology / methods*
  • Female
  • Gene Expression Regulation, Neoplastic
  • Humans
  • RNA / metabolism
  • RNA Splicing*
  • Sequence Alignment / methods*
  • Sequence Analysis, RNA / methods*

Substances

  • RNA

Grants and funding

This study was supported by a grant from the National R&D Program for Cancer Control, Ministry of Health & Welfare, Republic of Korea (1720100).