Predicting miRNA-disease associations using an ensemble learning framework with resampling method

Brief Bioinform. 2022 Jan 17;23(1):bbab543. doi: 10.1093/bib/bbab543.

Abstract

Motivation: Accumulating evidences have indicated that microRNA (miRNA) plays a crucial role in the pathogenesis and progression of various complex diseases. Inferring disease-associated miRNAs is significant to explore the etiology, diagnosis and treatment of human diseases. As the biological experiments are time-consuming and labor-intensive, developing effective computational methods has become indispensable to identify associations between miRNAs and diseases.

Results: We present an Ensemble learning framework with Resampling method for MiRNA-Disease Association (ERMDA) prediction to discover potential disease-related miRNAs. Firstly, the resampling strategy is proposed for building multiple different balanced training subsets to address the challenge of sample imbalance within the database. Then, ERMDA extracts miRNA and disease feature representations by integrating miRNA-miRNA similarities, disease-disease similarities and experimentally verified miRNA-disease association information. Next, the feature selection approach is applied to reduce the redundant information and increase the diversity among these subsets. Lastly, ERMDA constructs an individual learner on each subset to yield primitive outcomes, and the soft voting method is introduced for making the final decision based on the prediction results of individual learners. A series of experimental results demonstrates that ERMDA outperforms other state-of-the-art methods on both balanced and unbalanced testing sets. Besides, case studies conducted on the three human diseases further confirm the ERMDA's prediction capability for identifying potential disease-related miRNAs. In conclusion, these experimental results demonstrate that our method can serve as an effective and reliable tool for researchers to explore the regulatory role of miRNAs in complex diseases.

Keywords: ensemble learning; feature selection; miRNA-disease association; resampling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology
  • Disease / genetics*
  • Genetic Association Studies*
  • Genetic Predisposition to Disease / genetics
  • Humans
  • Machine Learning*
  • MicroRNAs / genetics*

Substances

  • MicroRNAs