Random phenotypic variation of yeast (Saccharomyces cerevisiae) single-gene knockouts fits a double pareto-lognormal distribution

PLoS One. 2012;7(11):e48964. doi: 10.1371/journal.pone.0048964. Epub 2012 Nov 6.

Abstract

Background: Distributed robustness is thought to influence the buffering of random phenotypic variation through the scale-free topology of gene regulatory, metabolic, and protein-protein interaction networks. If this hypothesis is true, then the phenotypic response to the perturbation of particular nodes in such a network should be proportional to the number of links those nodes make with neighboring nodes. This suggests a probability distribution approximating an inverse power-law of random phenotypic variation. Zero phenotypic variation, however, is impossible, because random molecular and cellular processes are essential to normal development. Consequently, a more realistic distribution should have a y-intercept close to zero in the lower tail, a mode greater than zero, and a long (fat) upper tail. The double Pareto-lognormal (DPLN) distribution is an ideal candidate distribution. It consists of a mixture of a lognormal body and upper and lower power-law tails.

Objective and methods: If our assumptions are true, the DPLN distribution should provide a better fit to random phenotypic variation in a large series of single-gene knockout lines than other skewed or symmetrical distributions. We fit a large published data set of single-gene knockout lines in Saccharomyces cerevisiae to seven different probability distributions: DPLN, right Pareto-lognormal (RPLN), left Pareto-lognormal (LPLN), normal, lognormal, exponential, and Pareto. The best model was judged by the Akaike Information Criterion (AIC).

Results: Phenotypic variation among gene knockouts in S. cerevisiae fits a double Pareto-lognormal (DPLN) distribution better than any of the alternative distributions, including the right Pareto-lognormal and lognormal distributions.

Conclusions and significance: A DPLN distribution is consistent with the hypothesis that developmental stability is mediated, in part, by distributed robustness, the resilience of gene regulatory, metabolic, and protein-protein interaction networks. Alternatively, multiplicative cell growth, and the mixing of lognormal distributions having different variances, may generate a DPLN distribution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Genetic
  • Gene Knockout Techniques*
  • Genes, Fungal
  • Phenotype
  • Probability
  • Saccharomyces cerevisiae / cytology*
  • Saccharomyces cerevisiae / genetics*
  • Statistical Distributions*

Grants and funding

This research was supported by the School of Mathematical and Natural Sciences, Berry College, Mount Berry, Georgia, United States of America. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.