Computational models reveal genotype-phenotype associations in Saccharomyces cerevisiae

Yeast. 2014 Jul;31(7):265-77. doi: 10.1002/yea.3016. Epub 2014 May 26.

Abstract

Genome sequencing is essential to understand individual variation and to study the mechanisms that explain relations between genotype and phenotype. The accumulated knowledge from large-scale genome sequencing projects of Saccharomyces cerevisiae isolates is being used to study these mechanisms. Our objective was to genetically characterize 172 S. cerevisiae strains from different geographical origins and technological groups, using 11 polymorphic microsatellites, and to computationally relate these data to the results of 30 phenotypic tests. Genetic characterization revealed 280 alleles, with the microsatellite ScAAT1 contributing most to intrastrain variability, together with alleles 20, 9 and 16 from the microsatellites ScAAT4, ScAAT5 and ScAAT6. These microsatellite allelic profiles are characteristic of both the phenotype and the origin of yeast strains. We confirm the strength of these associations by constructing and cross-validating computational models that can predict the technological application and origin of a strain from its microsatellite allelic profile. Associations between microsatellites and specific phenotypes were scored using information gain ratios, and significant findings were confirmed by permutation tests and estimation of false discovery rates. The phenotypes associated with a higher number of alleles were resistance to sulphur dioxide (tested as the capacity to grow in the presence of potassium bisulphite) and the presence of galactosidase activity. Our study demonstrates the utility of computational modelling to estimate a strain's technological group and phenotype from microsatellite allelic combinations, as a tool for preliminary yeast strain selection.
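The scoring step named in the abstract (information gain ratio, checked by a permutation test) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binary presence/absence encoding of an allele, and the eight-strain toy data are all assumptions made for illustration, and the false discovery rate estimation is omitted.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature, target):
    """Information gain of target given feature, divided by the feature's own entropy."""
    n = len(target)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    # Expected entropy of the target after splitting the strains by feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return (entropy(target) - conditional) / split_info


def permutation_p_value(feature, target, n_perm=1000, seed=0):
    """Fraction of label shufflings scoring at least as high as the observed gain ratio."""
    rng = random.Random(seed)
    observed = gain_ratio(feature, target)
    shuffled = list(target)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gain_ratio(feature, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one permutation.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: allele presence (1) or absence (0) in eight strains,
# and a binary phenotype such as growth (1) or no growth (0) in one test.
allele_present = [1, 1, 1, 1, 0, 0, 0, 0]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]
unrelated_allele = [1, 0, 1, 0, 1, 0, 1, 0]
unrelated_phenotype = [1, 1, 0, 0, 1, 1, 0, 0]

score = gain_ratio(allele_present, phenotype)
null_score = gain_ratio(unrelated_allele, unrelated_phenotype)
p_value = permutation_p_value(allele_present, phenotype)
print(score, null_score, p_value)
```

In this toy case the first allele separates the phenotype perfectly, while the second pair is statistically independent. With many allele-phenotype pairs scored this way, the resulting p-values would still need a multiple-testing correction, which the abstract reports as estimation of false discovery rates.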

Keywords: Saccharomyces cerevisiae; data mining; microsatellite; nearest-neighbour classifier; phenotypic characterization.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Computer Simulation
  • DNA, Fungal / genetics*
  • Genetic Variation*
  • Genotype
  • Microsatellite Repeats / genetics*
  • Models, Genetic*
  • Phenotype
  • Principal Component Analysis
  • Saccharomyces cerevisiae / genetics*

Substances

  • DNA, Fungal