milRNApredictor: Genome-free prediction of fungi milRNAs by incorporating k-mer scheme and distance-dependent pair potential

Genomics. 2020 May;112(3):2233-2240. doi: 10.1016/j.ygeno.2019.12.019. Epub 2019 Dec 26.

Abstract

MicroRNA-like small RNAs (milRNAs), 21-22 nucleotides in length, are a class of small non-coding RNAs first found in Neurospora crassa in 2010. Identifying milRNAs in species without genomic information is a difficult problem. Here, knowledge-based energy features are developed to identify milRNAs by incorporating a k-mer scheme and a distance-dependent pair potential. Compared with the k-mer scheme, the features developed here alleviate the curse of dimensionality inherent in the k-mer scheme once k becomes large. In addition, milRNApredictor, built on the novel features, performs comparably to the k-mer scheme, achieving a sensitivity of 74.21% and a specificity of 75.72% in 10-fold cross-validation. Furthermore, for novel miRNA prediction, the results of milRNApredictor overlap highly with those of the state-of-the-art mirnovo. However, milRNApredictor is simpler to use, requiring less input data and fewer dependencies. Taken together, milRNApredictor can be used for de novo identification of fungal milRNAs and other very short small RNAs in non-model organisms.
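
To make the dimensionality point concrete, the following is a minimal sketch of a plain k-mer frequency encoding, not the milRNApredictor implementation. The function name `kmer_features`, the ACGU alphabet ordering, and the example sequence are assumptions chosen for illustration. It shows that the feature vector has 4^k entries, while a 22-nucleotide read contains at most 22 - k + 1 k-mers, so most entries are zero once k grows.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; illustrative ordering


def kmer_features(seq, k):
    """Return the normalized k-mer frequency vector of seq.

    The vector has len(ALPHABET) ** k entries, one per possible k-mer,
    in lexicographic order over ALPHABET.
    """
    all_kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    total = max(len(windows), 1)  # avoid division by zero for short input
    return [counts[kmer] / total for kmer in all_kmers]


# Arbitrary 22-nt sequence used only for illustration (not from the paper).
seq = "UGAGGUAGUAGGUUGUAUAGUU"

for k in range(1, 6):
    vec = kmer_features(seq, k)
    nonzero = sum(1 for v in vec if v > 0)
    # Dimension grows as 4**k; nonzero entries are bounded by len(seq) - k + 1.
    print(f"k={k}: dimension={len(vec)}, nonzero={nonzero}")
```

Running the loop prints the vector dimension next to the number of nonzero entries for each k, which is one way to see why sparse, high-dimensional k-mer vectors become hard to learn from for very short RNAs.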

Keywords: Knowledge-based energy feature; MiRNA; Prediction; Random forest; milRNA.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • High-Throughput Nucleotide Sequencing
  • Humans
  • MicroRNAs / chemistry*
  • RNA, Fungal / chemistry*
  • Sequence Analysis, RNA / methods*
  • Software

Substances

  • MicroRNAs
  • RNA, Fungal