Statistical estimates of multiple transcription factors binding in the model plant genomes based on ChIP-seq data

J Integr Bioinform. 2021 Dec 21;19(1):20200036. doi: 10.1515/jib-2020-0036.

Abstract

The development of high-throughput genomic sequencing coupled with chromatin immunoprecipitation technologies makes it possible to study the binding sites of transcription factors (TFs) at the genome scale. The growing volume of data on experimentally determined binding sites raises qualitatively new problems for the analysis of gene expression regulation, the prediction of TF target genes, and the reconstruction of regulatory gene networks. Genome regulation in plants remains insufficiently studied, although plants have complex molecular mechanisms that regulate gene expression and the response to environmental stresses. It is important to develop new software tools for analyzing the location of TF binding sites and their clustering in plant genomes, for visualization, and for subsequent statistical estimation. This study applies the analysis of multiple TF binding profiles to three evolutionarily distant model plant organisms. The construction and analysis of non-random ChIP-seq binding clusters of different TFs in mammalian embryonic stem cells were discussed earlier using similar bioinformatics approaches. Such clusters of TF binding sites may indicate gene regulatory regions, enhancers, and gene transcription regulatory hubs. They can be used for the analysis of gene promoters and as a basis for the reconstruction of transcription networks. We discuss the statistical estimates of TF binding site clusters in the model plant genomes. The distributions of the number of different TFs per binding cluster follow the same power-law distribution in all the genomes studied. The binding clusters in the Arabidopsis thaliana genome are discussed in detail.
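
The quantity summarized above, the number of different TFs per binding cluster, can be illustrated with a minimal sketch. This is not the authors' software: the function names, the input format of (chromosome, position, TF) tuples, the toy data, and the 300 bp merging distance are all assumptions made for illustration only.

```python
from collections import Counter

def cluster_sites(sites, max_gap=300):
    """Group (chrom, position, tf) binding sites into clusters.

    Neighbouring sites on the same chromosome join one cluster when they
    are at most max_gap bp apart (max_gap is a hypothetical threshold).
    """
    clusters = []
    current = []
    for chrom, pos, tf in sorted(sites):
        # Start a new cluster on a chromosome change or a large gap.
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            clusters.append(current)
            current = []
        current.append((chrom, pos, tf))
    if current:
        clusters.append(current)
    return clusters

def tfs_per_cluster(clusters):
    """Map 'number of distinct TFs in a cluster' to 'number of such clusters'."""
    return Counter(len({tf for _, _, tf in cluster}) for cluster in clusters)

# Toy input; real data would come from ChIP-seq peak files.
sites = [
    ("Chr1", 1000, "TF_A"), ("Chr1", 1100, "TF_B"), ("Chr1", 1250, "TF_C"),
    ("Chr1", 5000, "TF_A"),
    ("Chr2", 200, "TF_B"), ("Chr2", 350, "TF_B"),
]
clusters = cluster_sites(sites)
hist = tfs_per_cluster(clusters)
for n_tfs in sorted(hist):
    print(n_tfs, hist[n_tfs])
```

On real data, plotting this histogram on log-log axes is one common way to check whether the counts are compatible with a power law.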

Keywords: ChIP-seq; gene expression; plant genomes; regulatory gene networks; transcription factor binding sites; transcription regulation.

MeSH terms

  • Animals
  • Binding Sites / genetics
  • Chromatin Immunoprecipitation
  • Chromatin Immunoprecipitation Sequencing*
  • Genome, Plant
  • Mammals / genetics
  • Mammals / metabolism
  • Transcription Factors* / genetics
  • Transcription Factors* / metabolism

Substances

  • Transcription Factors