Identification of candidate transcription factor binding sites in the cattle genome

Genomics Proteomics Bioinformatics. 2013 Jun;11(3):195-8. doi: 10.1016/j.gpb.2012.10.004. Epub 2013 Feb 1.

Abstract

A resource that provides candidate transcription factor binding sites (TFBSs) does not currently exist for cattle. Such data are necessary, as predicted sites may serve as excellent starting locations for future omics studies that develop transcriptional regulation hypotheses. To generate this resource, we employed a phylogenetic footprinting approach, using sequence conservation across cattle, human and dog together with position-specific scoring matrices, to identify 379,333 putative TFBSs upstream of nearly 8000 Mammalian Gene Collection (MGC) annotated genes within the cattle genome. Comparisons of our predictions to known binding site loci within the PCK1, ACTA1 and G6PC promoter regions revealed 75% sensitivity for our method of discovery. Additionally, we intersected our predictions with known cattle SNP variants in dbSNP and on the Illumina BovineHD 770k and Bos 1 SNP chips, finding 7534, 444 and 346 overlaps, respectively. Because of our stringent filtering criteria, these results represent high-quality predictions of putative TFBSs within the cattle genome. All binding site predictions are freely available at http://bfgl.anri.barc.usda.gov/BovineTFBS/ or http://199.133.54.77/BovineTFBS.
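
As an illustration only, the general idea of combining position-specific scoring matrices with cross-species conservation, and then intersecting the retained sites with SNP positions, might be sketched as follows. This is not the authors' pipeline: the motif counts, the sequences, the score threshold, the uniform background, and the function names (`build_pssm`, `scan`, `footprint`, `overlapping_snps`) are all hypothetical, and the sketch assumes the three promoter sequences are already aligned without gaps.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
]


def build_pssm(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to log2-odds scores against a uniform background."""
    pssm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pssm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return start offsets of windows whose summed score meets the threshold."""
    width = len(pssm)
    starts = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            starts.append(start)
    return starts


def footprint(aligned, pssm, threshold):
    """Keep only sites that score above threshold at the same aligned offset
    in every species (assumes gap-free, equal-length alignments)."""
    per_species = [set(scan(seq, pssm, threshold)) for seq in aligned.values()]
    return sorted(set.intersection(*per_species))


def overlapping_snps(site_starts, width, snp_positions):
    """Return SNP positions that fall inside any retained site."""
    return [
        pos for pos in snp_positions
        if any(start <= pos < start + width for start in site_starts)
    ]


# Toy, made-up sequences; the second ACGT in cattle is not conserved.
aligned = {
    "cattle": "TTACGTTTACGT",
    "human":  "GGACGTGGAAGT",
    "dog":    "CCACGTCCACGA",
}
pssm = build_pssm(COUNTS)
sites = footprint(aligned, pssm, threshold=5.0)
snps_in_sites = overlapping_snps(sites, len(pssm), snp_positions=[0, 3, 9])
print(sites, snps_in_sites)
```

Requiring a hit at the same offset in all species is the simplest possible conservation filter; a real analysis would work from gapped multiple alignments and genome coordinates rather than string offsets.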

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Binding Sites
  • Cattle / genetics*
  • Conserved Sequence
  • Dogs
  • Gene Expression Regulation
  • Genome*
  • Humans
  • Molecular Sequence Data
  • Phylogeny
  • Position-Specific Scoring Matrices
  • Promoter Regions, Genetic
  • Protein Binding
  • Software
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism

Substances

  • Transcription Factors