A classifier system for predicting RNA secondary structure

Int J Bioinform Res Appl. 2014;10(3):307-20. doi: 10.1504/IJBRA.2014.060764.

Abstract

Finding the secondary structure of ribonucleic acid (RNA) sequences is an important task. The secondary structure helps determine the function of an RNA, which in turn plays a role in protein production. Manual laboratory methods use X-ray diffraction to determine secondary structures, but they are complex, slow and expensive. Therefore, different computational approaches are used to predict RNA secondary structure in order to reduce the time and cost associated with the manual process. We propose a system called IsRNA to predict a single element of the RNA secondary structure, the internal loop. IsRNA experiments with different classifiers, such as SVM, KNN, naive Bayes and simple K-means, to find the most accurate one. We present a thorough experimental evaluation of 24 features, classified into five groups, to determine the most relevant feature groups. The system is evaluated using Rfam sequences and achieves an overall sensitivity, selectivity, and accuracy of 96.1%, 98%, and 96.1%, respectively.
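The three reported measures can be illustrated with a minimal sketch that computes them from binary predictions (1 = internal loop, 0 = not an internal loop). The abstract does not give the formulas, so the definitions here are assumptions: sensitivity is taken as TP / (TP + FN), selectivity as TP / (TP + FP) (a convention common in RNA structure prediction, though some authors use the term for specificity), and accuracy as (TP + TN) / total. The function names and the toy labels are hypothetical and are not taken from IsRNA.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion-matrix counts for a binary classification task."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, FP, TN and FN from parallel lists of labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Counts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def sensitivity(c):
    """Fraction of true internal loops that were detected: TP / (TP + FN)."""
    return _ratio(c.tp, c.tp + c.fn)


def selectivity(c):
    """Assumed definition: fraction of predicted loops that are real, TP / (TP + FP)."""
    return _ratio(c.tp, c.tp + c.fp)


def accuracy(c):
    """Fraction of all predictions that are correct: (TP + TN) / total."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not Rfam results.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    counts = confusion_counts(y_true, y_pred)
    print(f"sensitivity={sensitivity(counts):.3f}")
    print(f"selectivity={selectivity(counts):.3f}")
    print(f"accuracy={accuracy(counts):.3f}")
```

Sensitivity and accuracy can coincide, as they do in the reported figures, whenever the error rate on positive examples matches the overall error rate; this is a property of the counts, not of the formulas.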

Keywords: KNN; RNA secondary structure; RNA sequences; SVM; bioinformatics; classifiers; internal loop; k–nearest neighbour; naive Bayes; protein production; ribonucleic acid; simple K–means; support vector machine.

MeSH terms

  • Artificial Intelligence
  • Base Sequence
  • Computer Simulation
  • Models, Chemical*
  • Models, Molecular*
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Pattern Recognition, Automated / methods
  • RNA / chemistry*
  • RNA / genetics
  • RNA / ultrastructure*
  • Sequence Analysis, RNA / methods*

Substances

  • RNA