Adaboost-SVM-based probability algorithm for the prediction of all mature miRNA sites based on structured-sequence features

Sci Rep. 2019 Feb 6;9(1):1521. doi: 10.1038/s41598-018-38048-7.

Abstract

The significant role of microRNAs (miRNAs) in various biological processes and diseases has been widely studied and reported in recent years. Several computational methods for mature miRNA identification suffer from limitations in the extraction of canonical biological features, class imbalance, and classifier performance. The proposed classifier, miRFinder, is an accurate alternative for the identification of mature miRNAs. Structured-sequence features were proposed to extract miRNA biological features precisely, and three algorithms were selected to obtain the canonical features on the basis of classifier performance. Moreover, a K-means-based training scheme that uses distance to the cluster centers of mass (center of mass near distance training) was provided to mitigate the class imbalance problem. In particular, the AdaBoost-SVM algorithm was used to construct the classifier. Training focuses on incorrectly classified samples, and the integrated result combines the decisions of the weak classifiers, each with its own weight. In addition, all mature miRNA sites were predicted by different classifiers built on the features of the respective sites. Compared with other methods, the classifiers identify mature miRNAs with a high degree of efficacy. miRFinder is freely available at https://github.com/wangying0128/miRFinder.
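
The boosting idea summarized in the abstract (reweight the training set toward misclassified samples, then combine weak classifiers by weighted vote) can be illustrated with a minimal sketch. This is not the miRFinder implementation: the weak learner here is a one-feature threshold rule (a decision stump) swapped in for the SVM so that the example needs only the standard library, the function names are hypothetical, and the toy data are invented for illustration.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] >= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] >= thr else -pol

def adaboost_fit(X, y, rounds=3):
    """Discrete AdaBoost with labels in {-1, +1}; the stump stands in for the paper's SVM."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)      # vote weight of this weak classifier
        ensemble.append((alpha, stump))
        # Misclassified samples (y * prediction = -1) gain weight; the rest lose weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

if __name__ == "__main__":
    # Invented toy data: no single threshold separates the two classes.
    X = [[1], [2], [3], [4], [5], [6]]
    y = [1, 1, -1, -1, 1, 1]
    model = adaboost_fit(X, y, rounds=3)
    print([adaboost_predict(model, row) for row in X])
```

On this toy set a single stump misclassifies two samples, while the weighted vote of three stumps reproduces all six labels. A closer analogue of the published method would replace `train_stump` with an SVM trained on the weighted samples.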

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computational Biology / methods*
  • Humans
  • MicroRNAs / analysis*
  • MicroRNAs / biosynthesis
  • MicroRNAs / chemistry
  • MicroRNAs / genetics*
  • RNA Precursors / analysis*
  • RNA Precursors / biosynthesis
  • RNA Precursors / chemistry
  • RNA Precursors / genetics*
  • Support Vector Machine*

Substances

  • MicroRNAs
  • RNA Precursors