DM-RPIs: Predicting ncRNA-protein interactions using stacked ensembling strategy

Comput Biol Chem. 2019 Dec:83:107088. doi: 10.1016/j.compbiolchem.2019.107088. Epub 2019 Jul 6.

Abstract

ncRNA-protein interactions (ncRPIs) play an important role in a number of cellular processes, such as post-transcriptional modification, transcriptional regulation, and disease progression and development. Because experimental methods for identifying ncRPIs are expensive and time-consuming, we propose a computational method, Deep Mining ncRNA-Protein Interactions (DM-RPIs), for identifying them. To reduce dimensionality and extract hidden information from the k-mer frequencies of RNA and protein sequences, a Deep Stacking Auto-encoders Networks (DSANs) model was used to refine the raw data. Three common machine learning algorithms, Support Vector Machine (SVM), Random Forest (RF), and Convolution Neural Network (CNN), were trained separately as individual predictors, and the three predictors were then integrated using a stacked ensembling strategy. On the RPI2241 dataset, DM-RPIs obtains an accuracy of 0.851, precision of 0.852, sensitivity of 0.873, specificity of 0.826, and MCC of 0.701, a promising and pioneering result for the prediction of ncRPIs.
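
The k-mer frequency representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the value of k, the alphabets, or the normalization used in DM-RPIs, so the function name `kmer_frequencies`, the choice k=2, the four-letter RNA alphabet, and the example sequence are all assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, alphabet, k):
    """Return normalized k-mer frequencies as a fixed-length vector.

    Hypothetical helper: k, the alphabet, and the normalization are
    illustrative assumptions, not taken from the DM-RPIs paper.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k in the sequence.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all possible k-mers so that every sequence maps to a
    # vector of the same length, len(alphabet) ** k.
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    # Windows containing symbols outside the alphabet are ignored.
    total = sum(counts[kmer] for kmer in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


# Example: 2-mer frequencies of a short, made-up RNA sequence.
rna_features = kmer_frequencies("AUGGCUAUGG", "ACGU", k=2)
print(len(rna_features))
```

The same function would apply to a protein sequence by passing an amino-acid alphabet. Because the vector length grows as len(alphabet) ** k, such features quickly become high-dimensional, which is the kind of raw input a dimensionality-reducing model such as an auto-encoder would then compress.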

Keywords: Convolution Neural Network (CNN); Deep Stacking Auto-encoders Networks (DSANs); Random Forest (RF); Stacked integrate; Support Vector Machine (SVM); ncRNA-protein interactions.

MeSH terms

  • Machine Learning*
  • Neural Networks, Computer
  • Proteins / chemistry*
  • RNA, Untranslated / chemistry*
  • Support Vector Machine

Substances

  • Proteins
  • RNA, Untranslated