Computational detection of significant variation in binding affinity across two sets of sequences with application to the analysis of replication origins in yeast

Uri Keich; Hong Gao; Jeffrey S Garretson; Anand Bhaskar; Ivan Liachko; Justin Donato; Bik K Tye

doi:10.1186/1471-2105-9-372

BMC Bioinformatics. 2008 Sep 12:9:372. doi: 10.1186/1471-2105-9-372.

Authors

Uri Keich¹, Hong Gao, Jeffrey S Garretson, Anand Bhaskar, Ivan Liachko, Justin Donato, Bik K Tye

Affiliation

¹ Department of Computer Science, Cornell University, Ithaca, NY 14853, USA. keich@cs.cornell.edu

Abstract

Background: In analyzing the stability of DNA replication origins in Saccharomyces cerevisiae, we faced the question of whether one set of sequences is significantly enriched, relative to another set, in the number and/or the quality of the matches to a particular position weight matrix.

Results: We present SADMAMA, a computational solution to this problem. SADMAMA implements two types of statistical tests: one type is based on simplified models, while the other relies on bootstrapping and as such might be preferable to users who are averse to such models. The bootstrap approach incorporates a novel "site-protected" resampling procedure, which solves a problem we identify with naive resampling.
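To make the underlying question concrete, the following is a minimal sketch of a generic resampling test for enrichment of position weight matrix matches in one sequence set relative to another. It is not SADMAMA's implementation and does not reproduce the "site-protected" procedure, whose details are not given here. The toy matrix `PWM`, the uniform `BACKGROUND`, the best-match-per-sequence statistic, and the function names are all assumptions chosen for illustration, and the resampling shown is plain label shuffling.

```python
import math
import random

# Hypothetical toy PWM (illustration only): base probabilities per motif position.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def best_match_score(seq, pwm=PWM, bg=BACKGROUND):
    """Highest log-odds PWM score over all windows of seq (forward strand only).

    Assumes len(seq) >= len(pwm) and that seq contains only A, C, G, T.
    """
    width = len(pwm)
    return max(
        sum(math.log2(pwm[i][seq[start + i]] / bg) for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def mean(values):
    return sum(values) / len(values)


def permutation_pvalue(set_a, set_b, n_perm=2000, seed=0):
    """One-sided p-value for set_a having a higher mean best-match score than set_b.

    Uses naive label shuffling of per-sequence scores, not a site-protected scheme.
    """
    rng = random.Random(seed)
    scores_a = [best_match_score(s) for s in set_a]
    scores_b = [best_match_score(s) for s in set_b]
    observed = mean(scores_a) - mean(scores_b)
    pooled = scores_a + scores_b
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Made-up example sequences: every sequence in set_a contains the consensus ATG.
set_a = ["CCATGCC", "GATGTTC", "TTTATGA", "ATGCCCC"]
set_b = ["CCCCCCC", "GGTTGGT", "TCTCTCT", "ACACACA"]
p_value = permutation_pvalue(set_a, set_b)
```

A small `p_value` here would suggest that `set_a` carries better matches to the toy matrix than `set_b`. Scoring only the single best window per sequence is a simplification; a statistic that also reflects the number of matches would need a different summary per sequence.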

Conclusion: SADMAMA's utility is demonstrated here by offering a plausible explanation for the differential ARS activity observed in our previous mcm1-1 mutant experiments [1], by suggesting the relevance of multiple weak ACS matches to efficient replication origin function in Saccharomyces cerevisiae, and by suggesting an explanation for the observed negative effect FKH2 has on chromatin silencing [2]. SADMAMA is available for download from http://www.cs.cornell.edu/~keich/.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms
Base Sequence
Binding Sites
Computer Simulation
DNA, Fungal / genetics*
Data Interpretation, Statistical
Genetic Variation / genetics
Models, Genetic
Molecular Sequence Data
Protein Binding
Replication Origin / genetics*
Saccharomyces cerevisiae / genetics*
Sequence Alignment / methods*
Sequence Analysis, DNA / methods*

Substances

DNA, Fungal
