Predicting Metabolite-Disease Associations Based on LightGBM Model

Front Genet. 2021 Apr 13:12:660275. doi: 10.3389/fgene.2021.660275. eCollection 2021.

Abstract

A large number of biological experiments have shown that metabolites are closely related to the occurrence and development of many complex human diseases; investigating the mechanisms behind these associations is therefore an important research topic. In this work, we propose a computational method named LGBMMDA, which uses the Light Gradient Boosting Machine (LightGBM) to predict potential metabolite-disease associations. The method extracts features from statistical measures, graph-theoretical measures, and matrix factorization results, and applies principal component analysis (PCA) to remove noise and redundancy. We compared LGBMMDA with other existing methods, and it achieved higher areas under the curve (AUCs). Additionally, three case studies further confirmed that LGBMMDA has a clear advantage in predicting metabolite-disease pairs and represents a powerful bioinformatics tool.
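
To make the feature-extraction idea concrete, the following is a minimal sketch of how statistical and graph-style features might be computed for each metabolite-disease pair. It is not the LGBMMDA implementation: the toy association matrix, the function name pair_features, and the five chosen features are all assumptions made for illustration. The matrix factorization features, the PCA step, and the LightGBM classifier described in the abstract are not shown here; they would operate on the feature matrix X and labels y built at the end.

```python
# Hypothetical toy data: rows are metabolites, columns are diseases,
# 1 marks a known association. Not taken from the paper.
ASSOC = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]


def pair_features(assoc, m, d):
    """Illustrative feature vector for the pair (metabolite m, disease d).

    The pair's own entry assoc[m][d] is excluded from every feature so
    that the label does not leak into the features.
    """
    n_met = len(assoc)
    n_dis = len(assoc[0])

    # Associations of m with all other diseases, and of d with all other metabolites.
    row = [assoc[m][j] for j in range(n_dis) if j != d]
    col = [assoc[i][d] for i in range(n_met) if i != m]

    # Graph-style measures: degrees in the bipartite association graph.
    m_degree = sum(row)
    d_degree = sum(col)

    # Statistical measures: mean association rate of the row and the column.
    row_mean = m_degree / len(row)
    col_mean = d_degree / len(col)

    # Graph-style measure: other metabolites linked to d that also share
    # at least one other disease with m.
    shared = 0
    for i in range(n_met):
        if i == m or assoc[i][d] != 1:
            continue
        if any(assoc[i][j] and assoc[m][j] for j in range(n_dis) if j != d):
            shared += 1

    return [m_degree, d_degree, row_mean, col_mean, shared]


# Build one sample per metabolite-disease pair; the label is the known association.
X = []
y = []
for m in range(len(ASSOC)):
    for d in range(len(ASSOC[0])):
        X.append(pair_features(ASSOC, m, d))
        y.append(ASSOC[m][d])
```

In a fuller pipeline, X would typically be reduced with PCA and then passed, together with y, to a gradient boosting classifier; unlabeled pairs with high predicted scores would be the candidate associations.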

Keywords: computational method; features; light gradient boosting machine; metabolite-disease associations; performance evaluation.