Prediction of Gene Expression Patterns With Generalized Linear Regression Model

Front Genet. 2019 Mar 4:10:120. doi: 10.3389/fgene.2019.00120. eCollection 2019.

Abstract

Cell reprogramming has played important roles in medical science, such as tissue repair, organ reconstruction, disease treatment, new drug development, and new species breeding. Oct4, a core pluripotency factor, has especially played a key role in somatic cell reprogramming through transcriptional control and affects the expression level of genes by its combination intensity. However, the quantitative relationship between Oct4 combination intensity and target gene expression is still not clear. Therefore, firstly, a generalized linear regression method was constructed to predict gene expression values in promoter regions affected by Oct4 combination intensity. Training data, including Oct4 combination intensity and target gene expression, were from promoter regions of genes with different cell development stages. Additionally, the quantitative relationship between gene expression and Oct4 combination intensity was analyzed with the proposed model. Then, the quantitative relationship between gene expression and Oct4 combination intensity at each stage of cell development was classified into high and low levels. Experimental analysis showed that the combination height of Oct4-inhibited gene expression decremented by a temporal exponential value, whereas the combination width of Oct4-promoted gene expression incremented by a temporal logarithmic value. Experimental results showed that the proposed method can achieve goodness of fit with high confidence.

Keywords: Oct4; cell reprogramming; combination intensity; gene expression pattern; generalized linear regression model; prediction; transcription factor binding site (TFBS).