Inferring Causation in Yeast Gene Association Networks With Kernel Logistic Regression

Evol Bioinform Online. 2020 Jun 24:16:1176934320920310. doi: 10.1177/1176934320920310. eCollection 2020.

Abstract

Computational prediction of gene-gene associations is one of the productive directions in the study of bioinformatics. Many tools are developed to infer the relation between genes using different biological data sources. The association of a pair of genes deduced from the analysis of biological data becomes meaningful when it reflects the directionality and the type of reaction between genes. In this work, we follow another method to construct a causal gene co-expression network while identifying transcription factors in each pair of genes using microarray expression data. We adopt a machine learning technique based on a logistic regression model to tackle the sparsity of the network and to improve the quality of the prediction accuracy. The proposed system classifies each pair of genes into either connected or nonconnected class using the data of the correlation between these genes in the whole Saccharomyces cerevisiae genome. The accuracy of the classification model in predicting related genes was evaluated using several data sets for the yeast regulatory network. Our system achieves high performance in terms of several statistical measures.

Keywords: Bioinformatics; gene co-expression network; predictive model; transcription factor.