bPeaks: a bioinformatics tool to detect transcription factor binding sites from ChIPseq data in yeasts and other organisms with small genomes

Yeast. 2014 Oct;31(10):375-91. doi: 10.1002/yea.3031. Epub 2014 Jul 28.

Abstract

Peak calling is a critical step in ChIPseq data analysis. Choosing an appropriate algorithm and optimizing its parameters for a specific biological system are essential. In this article, we present an original peak-calling method (bPeaks) specifically designed to detect transcription factor (TF) binding sites in small eukaryotic genomes, such as those of yeasts. As TF interactions with DNA are strong and generate high binding signals, bPeaks uses simple parameters to compare the sequences (reads) obtained from the immunoprecipitation (IP) with those from the control DNA (input). Because yeasts have small genomes (<20 Mb), our program has the advantage of using ChIPseq information at the single-nucleotide level and can explore, in a reasonable computational time, the results obtained with different sets of parameter values. Graphical outputs and text files are provided to rapidly assess the relevance of the detected peaks. Taking advantage of the simple promoter structure in yeasts, additional functions were implemented in bPeaks to automatically assign the peaks to promoter regions and to retrieve peak coordinates on the DNA sequence for further prediction of regulatory motifs enriched in the list of peaks. Applications of the bPeaks program to three different ChIPseq datasets from Saccharomyces cerevisiae, Candida albicans and Candida glabrata are presented. In each case, bPeaks allowed us to correctly predict the DNA binding sequence of the studied TF and provided relevant lists of peaks. The bioinformatics tool bPeaks is freely distributed to academic users. Supplementary data, together with detailed tutorials, are available online: http://bpeaks.gene-networks.net.
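
The general idea described in the abstract (comparing IP and control signal at single-nucleotide resolution, then assigning peaks to promoters) can be illustrated with a minimal Python sketch. This is not the bPeaks implementation: the function names, the window and step sizes, the three thresholds (min_ip, max_control, min_log2fc), the pseudocount of 1 and the fixed-length promoter definition are all assumptions chosen for illustration only.

```python
import math


def coverage(read_intervals, genome_length):
    """Per-nucleotide read depth from 0-based, half-open (start, end) intervals."""
    diff = [0] * (genome_length + 1)
    for start, end in read_intervals:
        start = max(0, min(start, genome_length))
        end = max(0, min(end, genome_length))
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:genome_length]:
        running += delta
        depth.append(running)
    return depth


def call_peaks(ip, control, window=4, step=2,
               min_ip=3.0, max_control=2.0, min_log2fc=1.0):
    """Flag sliding windows where IP is high, control is low and the
    log2 fold change is high, then merge overlapping flagged windows.
    All thresholds are hypothetical illustration values."""
    peaks = []
    for start in range(0, len(ip) - window + 1, step):
        end = start + window
        ip_mean = sum(ip[start:end]) / window
        ctl_mean = sum(control[start:end]) / window
        # Pseudocount of 1 avoids division by zero (an assumption).
        log2fc = math.log2((ip_mean + 1) / (ctl_mean + 1))
        if ip_mean >= min_ip and ctl_mean <= max_control and log2fc >= min_log2fc:
            if peaks and start <= peaks[-1][1]:
                peaks[-1] = (peaks[-1][0], end)  # extend the previous peak
            else:
                peaks.append((start, end))
    return peaks


def assign_to_promoters(peaks, genes, upstream=5):
    """Map each peak to genes whose promoter it overlaps. The promoter is
    taken to be the `upstream` bases before the start site (hypothetical)."""
    assigned = {}
    for peak_start, peak_end in peaks:
        hits = []
        for name, tss, strand in genes:
            if strand == "+":
                prom_start, prom_end = tss - upstream, tss
            else:
                prom_start, prom_end = tss + 1, tss + upstream + 1
            if peak_start < prom_end and prom_start < peak_end:
                hits.append(name)
        assigned[(peak_start, peak_end)] = hits
    return assigned


# Toy example on a 20-nucleotide "genome" (made-up reads and genes).
demo_ip = coverage([(6, 14)] * 5 + [(0, 20)], 20)
demo_control = coverage([(0, 20)], 20)
demo_peaks = call_peaks(demo_ip, demo_control)
print(demo_peaks)
print(assign_to_promoters(demo_peaks, [("geneA", 18, "+"), ("geneB", 2, "-")]))
```

Because depth is stored per nucleotide, rerunning call_peaks with different threshold values only requires another pass over the same arrays, which is cheap for a genome of less than 20 Mb.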

Keywords: ChIPseq; bioinformatics; peak-calling; regulatory motifs; transcription factors; yeasts.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Binding Sites
  • Candida / genetics*
  • Candida albicans / genetics
  • Computational Biology / methods*
  • Fungal Proteins / genetics
  • Fungal Proteins / metabolism
  • Genome, Fungal / genetics*
  • High-Throughput Nucleotide Sequencing
  • Oligonucleotide Array Sequence Analysis
  • Protein Binding
  • Saccharomyces cerevisiae / genetics*
  • Sequence Analysis, DNA
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism

Substances

  • Fungal Proteins
  • Transcription Factors