Deep Collaborative Filtering for Prediction of Disease Genes

Xiangxiang Zeng; Yinglai Lin; Yuying He; Linyuan Lu; Xiaoping Min; Alfonso Rodriguez-Paton

doi:10.1109/TCBB.2019.2907536

IEEE/ACM Trans Comput Biol Bioinform. 2020 Sep-Oct;17(5):1639-1647. doi: 10.1109/TCBB.2019.2907536. Epub 2019 Mar 26.

PMID: 30932845

Abstract

Accurate prioritization of potential disease genes is a fundamental challenge in biomedical research, and various algorithms have been developed to address it. Inductive Matrix Completion (IMC) is one of the most reliable models, owing to its well-established framework and its superior performance in predicting gene-disease associations. However, IMC does not hierarchically extract deep features, which might limit the quality of recovery. To address this, our Deep Collaborative Filtering (DCF) model applies a deep learning architecture, which obtains high-level representations and handles the noise and outliers present in large-scale biological datasets, to the side information of genes. Further, because negative examples are lacking, we also apply a Positive-Unlabeled (PU) learning formulation to low-rank matrix completion. Our approach achieves substantially improved performance over other state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database. It is 10 percent more efficient than standard IMC in detecting a true association, and it significantly outperforms other alternatives in terms of the precision-recall metric at the top-k predictions. Moreover, we also validate the approach on diseases with no previously known gene associations and on newly reported OMIM associations. The experimental results show that DCF remains satisfactory for ranking novel disease phenotypes as well as for mining unexplored relationships. The source code and the data are available at https://github.com/xzenglab/DCF.
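
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a bilinear IMC score combined with a PU-style weighted loss. It is not the authors' implementation (see the repository above for that): it uses raw feature matrices where DCF would use deep representations of the gene side information, and the function names `pu_imc_loss` and `fit`, the unlabeled-entry weight `alpha`, the regularization strength `lam`, and the plain gradient-descent optimizer are all assumptions made for illustration.

```python
import numpy as np

def pu_imc_loss(P, X, Y, W, H, alpha=0.1, lam=0.01):
    """PU-weighted squared loss for the bilinear IMC score X @ W @ H.T @ Y.T.

    P: (genes x diseases) binary matrix, 1 = known association, 0 = unlabeled.
    X: gene features, Y: disease features, W and H: low-rank factors.
    alpha and lam are hypothetical hyperparameters chosen for this sketch.
    """
    S = X @ W @ H.T @ Y.T                    # predicted association scores
    weights = np.where(P == 1, 1.0, alpha)   # unlabeled entries count less
    reg = lam * (np.sum(W ** 2) + np.sum(H ** 2))
    return np.sum(weights * (S - P) ** 2) + reg

def fit(P, X, Y, rank=2, alpha=0.1, lam=0.01, lr=1e-3, steps=500, seed=0):
    """Fit W and H by plain gradient descent (an assumption, for brevity)."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((X.shape[1], rank))
    H = 0.1 * rng.standard_normal((Y.shape[1], rank))
    weights = np.where(P == 1, 1.0, alpha)
    for _ in range(steps):
        # Gradient of the weighted squared error with respect to the scores.
        G = 2.0 * weights * (X @ W @ H.T @ Y.T - P)
        # Both gradients are computed before either factor is updated.
        grad_W = X.T @ G @ Y @ H + 2.0 * lam * W
        grad_H = Y.T @ G.T @ X @ W + 2.0 * lam * H
        W -= lr * grad_W
        H -= lr * grad_H
    return W, H
```

Under these assumptions, candidate genes for disease `j` could be ranked by sorting column `j` of `X @ W @ H.T @ Y.T` in descending order, which is the kind of ranking a top-k precision-recall evaluation would score.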

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Animals
Computational Biology / methods*
Databases, Genetic
Deep Learning*
Disease / genetics*
Genes / genetics
Genetic Association Studies / methods*
Humans
Mice