Analysis and Prediction of Exon Skipping Events from RNA-Seq with Sequence Information Using Rotation Forest

Int J Mol Sci. 2017 Dec 12;18(12):2691. doi: 10.3390/ijms18122691.

Abstract

In bioinformatics, exon skipping (ES) event prediction is an essential part of alternative splicing (AS) event analysis. Although many methods have been developed to predict ES events, a solution has yet to be found. In this study, given the limitations of machine learning algorithms with RNA-Seq data or genome sequences, a new feature, called RS (RNA-seq and sequence) features, was constructed. These features include RNA-Seq features derived from the RNA-Seq data and sequence features derived from genome sequences. We propose a novel Rotation Forest classifier to predict ES events with the RS features (RotaF-RSES). To validate the efficacy of RotaF-RSES, a dataset from two human tissues was used, and RotaF-RSES achieved an accuracy of 98.4%, a specificity of 99.2%, a sensitivity of 94.1%, and an area under the curve (AUC) of 98.6%. When compared to the other available methods, the results indicate that RotaF-RSES is efficient and can predict ES events with RS features.

Keywords: RNA-Seq data; exon skipping event; sequence information.

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Exons / genetics*
  • Humans
  • RNA Splicing*
  • Reproducibility of Results
  • Sequence Analysis, RNA / methods*