Machine learning approaches for predicting biomolecule-disease associations

Brief Funct Genomics. 2021 Jul 17;20(4):273-287. doi: 10.1093/bfgp/elab002.

Abstract

Biomolecules such as microRNAs, circRNAs, lncRNAs and genes are functionally interdependent in human cells, and all play critical roles in diverse fundamental biological processes. Dysregulation of these biomolecules can cause disease. Identifying the associations between biomolecules and diseases can uncover the mechanisms of complex diseases, which aids their diagnosis, treatment, prognosis and prevention. Because biological experiments are time-consuming and costly, many computational association prediction methods have been proposed in the past few years. In this study, we provide a comprehensive review of machine learning-based approaches for predicting biomolecule-disease associations with multi-view data sources. First, we introduce some databases and general strategies for integrating multi-view data sources in the prediction models. Second, we discuss several feature representation methods for machine learning-based prediction models. Third, we comprehensively review machine learning-based prediction approaches in three categories (basic machine learning methods, matrix completion-based methods and deep learning-based methods) and discuss their advantages and disadvantages. Finally, we provide some perspectives for further improving biomolecule-disease association prediction methods.

Keywords: biomolecule–disease association; deep learning; feature representation; machine learning; multi-view data source; non-negative matrix factorization.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Computational Biology
  • Humans
  • Machine Learning
  • MicroRNAs*
  • RNA, Circular
  • RNA, Long Noncoding*

Substances

  • MicroRNAs
  • RNA, Circular
  • RNA, Long Noncoding