Computational detection of significant variation in binding affinity across two sets of sequences with application to the analysis of replication origins in yeast

BMC Bioinformatics. 2008 Sep 12;9:372. doi: 10.1186/1471-2105-9-372.

Abstract

Background: In analyzing the stability of DNA replication origins in Saccharomyces cerevisiae, we faced the question of whether one set of sequences is significantly enriched, relative to another set, in the number and/or the quality of the matches to a particular position weight matrix.

Results: We present SADMAMA, a computational solution to address this problem. SADMAMA implements two types of statistical tests to answer this question: one type is based on simplified models, while the other relies on bootstrapping and, as such, might be preferable to users who are averse to such models. The bootstrap approach incorporates a novel "site-protected" resampling procedure which solves a problem we identify with naive resampling.
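To make the underlying question concrete, here is a minimal Python sketch of comparing two sequence sets by their matches to a position weight matrix. It is not SADMAMA's implementation and does not reproduce either its model-based tests or its site-protected bootstrap. As a stand-in, it uses a plain label-permutation test on each sequence's best log-odds score. All names (`log_odds_pwm`, `best_score`, `permutation_p_value`), the uniform background, the pseudocount, and the forward-strand-only scan are assumptions made for illustration.

```python
import math
import random

BASES = "ACGT"


def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into a log2-odds matrix (uniform background assumed)."""
    pwm = []
    for col in counts:
        total = sum(col[b] for b in BASES) + 4 * pseudocount
        pwm.append(
            {b: math.log2(((col[b] + pseudocount) / total) / background) for b in BASES}
        )
    return pwm


def best_score(seq, pwm):
    """Best log-odds score over all windows of one sequence (forward strand only)."""
    w = len(pwm)
    return max(
        sum(pwm[i][seq[j + i]] for i in range(w)) for j in range(len(seq) - w + 1)
    )


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(set_a, set_b, pwm, n_perm=1000, seed=0):
    """One-sided test: is the mean best score in set_a higher than in set_b?

    Set labels are shuffled over the per-sequence scores; this is an
    illustrative substitute, not the resampling scheme used by SADMAMA.
    """
    rng = random.Random(seed)
    scores = [best_score(s, pwm) for s in set_a + set_b]
    n_a = len(set_a)
    observed = mean(scores[:n_a]) - mean(scores[n_a:])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(scores)
        if mean(scores[:n_a]) - mean(scores[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical toy motif "ACGT" and made-up sequences, for demonstration only.
    counts = [
        {"A": 10, "C": 0, "G": 0, "T": 0},
        {"A": 0, "C": 10, "G": 0, "T": 0},
        {"A": 0, "C": 0, "G": 10, "T": 0},
        {"A": 0, "C": 0, "G": 0, "T": 10},
    ]
    pwm = log_odds_pwm(counts)
    with_motif = ["TTACGTTT", "GGACGTGG", "CCACGTCC", "ACGTAAAA", "TTTTACGT"]
    without_motif = ["TTTTTTTT", "GGGGGGGG", "CCCCCCCC", "AAAAAAAA", "TGTGTGTG"]
    diff, p = permutation_p_value(with_motif, without_motif, pwm)
    print(f"mean best-score difference = {diff:.3f}, permutation p = {p:.4f}")
```

This sketch summarizes each sequence by a single best score, so it captures match quality but ignores the number of matches per sequence, which is part of the question stated above. Addressing both aspects would require a different test statistic.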

Conclusion: SADMAMA's utility is demonstrated here by offering a plausible explanation for the differential ARS activity observed in our previous mcm1-1 mutant experiments [1], by suggesting the relevance of multiple weak ACS matches to efficient replication origin function in Saccharomyces cerevisiae, and by suggesting an explanation for the observed negative effect FKH2 has on chromatin silencing [2]. SADMAMA is available for download from http://www.cs.cornell.edu/~keich/.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Base Sequence
  • Binding Sites
  • Computer Simulation
  • DNA, Fungal / genetics*
  • Data Interpretation, Statistical
  • Genetic Variation / genetics
  • Models, Genetic
  • Molecular Sequence Data
  • Protein Binding
  • Replication Origin / genetics*
  • Saccharomyces cerevisiae / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Fungal