Boolean networks using the chi-square test for inferring large-scale gene regulatory networks

BMC Bioinformatics. 2007 Feb 1;8:37. doi: 10.1186/1471-2105-8-37.

Abstract

Background: Boolean network (BN) modeling is a commonly used method for constructing gene regulatory networks from time series microarray data. Its major drawback is that the computation time is very high, often making it impractical for constructing large-scale gene networks. We propose a variable selection method that not only reduces BN computation times significantly but also obtains optimal network constructions by using chi-square statistics to test for independence in contingency tables.
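The variable selection idea can be illustrated with a minimal sketch. It assumes the expression data are already binarized to 0/1, that each candidate gene at time t is cross-tabulated against the target gene at time t+1 in a 2x2 contingency table, and that candidates are kept when the Pearson chi-square statistic exceeds the 5% critical value for one degree of freedom (3.841). The function names, the toy series, and the threshold are illustrative assumptions, not the procedure or settings reported in the paper.

```python
from typing import Dict, List, Sequence

# Critical value of the chi-square distribution with 1 degree of freedom
# at alpha = 0.05 (assumed significance level for this sketch).
CHI2_CRIT_DF1_ALPHA05 = 3.841


def chi_square_2x2(x: Sequence[int], y: Sequence[int]) -> float:
    """Pearson chi-square statistic for two equal-length 0/1 sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    counts = [[0, 0], [0, 0]]
    for a, b in zip(x, y):
        counts[a][b] += 1
    row = [sum(counts[0]), sum(counts[1])]
    col = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected == 0:
                # A constant sequence gives no evidence of dependence.
                return 0.0
            stat += (counts[i][j] - expected) ** 2 / expected
    return stat


def screen_regulators(
    series: Dict[str, List[int]],
    target: str,
    crit: float = CHI2_CRIT_DF1_ALPHA05,
) -> List[str]:
    """Keep genes whose state at time t is associated with the target at t+1."""
    y = series[target][1:]
    kept = []
    for gene, states in series.items():
        x = states[:-1]
        if chi_square_2x2(x, y) > crit:
            kept.append(gene)
    return kept


# Hypothetical toy data: g3 at time t+1 copies g1 at time t.
series = {
    "g1": [1, 0, 0, 1, 1, 0, 1, 0, 0],
    "g2": [1, 1, 0, 1, 0, 0, 1, 0, 1],
    "g3": [0, 1, 0, 0, 1, 1, 0, 1, 0],
}
selected = screen_regulators(series, "g3")
print(selected)
```

With series this short the chi-square approximation is unreliable, so the example only shows the mechanics: genes that fail the test are dropped before any Boolean function search, which shrinks the set of input combinations that has to be examined.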

Results: Both the computation time and the accuracy of the network structures estimated by the proposed method are compared with those of the original BN methods on simulated and real yeast cell cycle microarray gene expression data sets. Our results reveal that the proposed chi-square testing (CST)-based BN method significantly improves the computation time, while its ability to identify all the true network mechanisms is effectively the same as that of full-search BN methods. The proposed BN algorithm is approximately 70.8 and 7.6 times faster than the original BN algorithm when the error sizes of the Best-Fit Extension problem are 0 and 1, respectively. Further, the false positive error rate of the proposed CST-based BN algorithm tends to be lower than that of the original BN algorithm.
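The "error size" of the Best-Fit Extension problem can be made concrete with a short sketch. It assumes the common unweighted formulation: for a fixed set of input genes, the smallest number of time points that any Boolean function of those inputs must misclassify, obtained by summing the minority output count over the observed input patterns. The function names, the toy data, and the limit of two inputs per gene are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def best_fit_error(inputs: Sequence[Sequence[int]], output: Sequence[int]) -> int:
    """Minimum number of mismatches over all Boolean functions of the inputs.

    Input states at time t are paired with the output state at time t+1.
    """
    tally = defaultdict(lambda: [0, 0])  # pattern -> [count of 0s, count of 1s]
    for t in range(len(output) - 1):
        pattern = tuple(states[t] for states in inputs)
        tally[pattern][output[t + 1]] += 1
    # The best function predicts the majority output for each pattern,
    # so each pattern contributes its minority count to the error.
    return sum(min(c) for c in tally.values())


def search(
    series: Dict[str, List[int]],
    target: str,
    candidates: Sequence[str],
    max_inputs: int = 2,
    max_error: int = 0,
) -> List[Tuple[Tuple[str, ...], int]]:
    """List input sets whose best-fit error does not exceed max_error."""
    hits = []
    for k in range(1, max_inputs + 1):
        for combo in combinations(candidates, k):
            err = best_fit_error([series[g] for g in combo], series[target])
            if err <= max_error:
                hits.append((combo, err))
    return hits


# Hypothetical toy data: g3 at time t+1 copies g1 at time t.
series = {
    "g1": [1, 0, 0, 1, 1, 0, 1, 0, 0],
    "g2": [1, 1, 0, 1, 0, 0, 1, 0, 1],
    "g3": [0, 1, 0, 0, 1, 1, 0, 1, 0],
}

# Full search over all genes versus a search restricted to a pre-screened set.
full = search(series, "g3", ["g1", "g2", "g3"], max_error=0)
restricted = search(series, "g3", ["g1"], max_error=0)
print(full)
print(restricted)
```

In this toy case the full search evaluates six input sets and the restricted search evaluates one. Because the number of input sets grows combinatorially with the number of genes, restricting the candidates in this way would be expected to reduce computation time substantially on larger networks.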

Conclusion: The CST-based BN method dramatically improves the computation time of the original BN algorithm. Therefore, it can efficiently infer large-scale gene regulatory network mechanisms.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Chi-Square Distribution
  • Cluster Analysis
  • Computer Simulation
  • Data Interpretation, Statistical
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / physiology*
  • Logistic Models
  • Models, Genetic*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Proteome / metabolism*
  • Signal Transduction / physiology*

Substances

  • Proteome