A duplication growth model of gene expression networks

Bioinformatics. 2002 Nov;18(11):1486-93. doi: 10.1093/bioinformatics/18.11.1486.

Abstract

Motivation: There has been considerable interest in developing computational techniques for inferring genetic regulatory networks from whole-genome expression profiles. When expression time series data sets are available, dynamic models can, in principle, be used to infer correlative relationships between gene expression levels, which may be causal. However, because of the range of detectable expression levels and the current quality of the data, the predictive power of such inferred, quantitative models is questionable. Network models derived from simple rate laws offer an intermediate level of analysis, going beyond simple statistical analysis but falling short of a fully quantitative description. This work shows how such network models can be constructed and describes the global properties of the networks derived from such a model. These global properties are statistically robust and provide insights into the design of the underlying network.

Results: Several whole-genome expression time series data sets from yeast microarray experiments were analyzed using a Markov-modeling method (Dewey and Galas, Funct. Integr. Genomics, 1, 269-278, 2001) to infer an approximation to the underlying genetic network. We found that the global statistical properties of all the resulting networks are similar. The overall structure of these biological networks is distinctly different from that of other recently studied networks such as the Internet or social networks. These biological networks show hierarchical, hub-like structures that have some properties similar to a class of graphs known as small-world graphs. Small-world networks exhibit local cliquishness while exhibiting strong global connectivity. In addition to the small-world properties, the biological networks show a power-law, or scale-free, distribution of connectivities. An inverse power law, N(k) ~ k^(-3/2), for the number of vertices (genes) with k connections was observed for three different data sets from yeast. We propose network growth models based on gene duplication events. Simulations of these models yield networks with the same combination of global graphical properties that we inferred from the expression data.
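
As an illustration only, a network growth process driven by gene duplication can be sketched as follows. The abstract does not state the growth rules used in the paper, so this is a generic duplication-with-divergence sketch under stated assumptions: each new gene copies a randomly chosen existing gene and keeps each of its connections with a fixed probability. The names `duplication_growth`, `degree_counts`, and `keep_prob`, the two-gene seed network, and the fallback link to the parent gene are all hypothetical choices made for this example, and the sketch is not claimed to reproduce the observed exponent of -3/2.

```python
import random
from collections import Counter


def duplication_growth(n_genes, keep_prob=0.5, seed=0):
    """Grow an undirected network by repeated gene duplication.

    Assumed rules (not taken from the paper): a random existing gene is
    copied, and the copy keeps each of the parent's links with
    probability keep_prob.
    """
    rng = random.Random(seed)
    # Seed network: two genes joined by a single edge.
    neighbors = {0: {1}, 1: {0}}
    while len(neighbors) < n_genes:
        parent = rng.randrange(len(neighbors))
        child = len(neighbors)
        # Divergence step: inherit a random subset of the parent's links.
        inherited = {g for g in neighbors[parent] if rng.random() < keep_prob}
        # Assumption: a copy that loses every link stays attached to its parent,
        # so the network never gains isolated genes.
        if not inherited:
            inherited = {parent}
        neighbors[child] = inherited
        for g in inherited:
            neighbors[g].add(child)
    return neighbors


def degree_counts(neighbors):
    """Return N(k): how many genes have exactly k connections."""
    return Counter(len(links) for links in neighbors.values())


if __name__ == "__main__":
    network = duplication_growth(2000, keep_prob=0.5, seed=1)
    for k, n_k in sorted(degree_counts(network).items()):
        print(k, n_k)
```

Plotting the printed N(k) against k on log-log axes is one way to check whether a given choice of `keep_prob` yields an approximately straight line, which would indicate power-law behavior over that range.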

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Validation Study

MeSH terms

  • Algorithms
  • Cell Cycle / genetics
  • Cluster Analysis
  • Computer Simulation
  • Gene Duplication*
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / genetics*
  • Genes / genetics
  • Genes / physiology*
  • Genes, Duplicate / genetics
  • Linear Models
  • Markov Chains
  • Models, Genetic*
  • Reproducibility of Results
  • Sample Size
  • Sensitivity and Specificity
  • Sequence Alignment / methods
  • Sequence Analysis, DNA / methods*
  • Yeasts / cytology
  • Yeasts / genetics
  • Yeasts / growth & development