On the waiting time until coordinated mutations get fixed in regulatory sequences

J Theor Biol. 2021 Sep 7:524:110657. doi: 10.1016/j.jtbi.2021.110657. Epub 2021 Mar 4.

Abstract

In this paper we consider the time evolution of a population of size N with overlapping generations, in the vicinity of m genes. We assume that this population is subject to point mutations, genetic drift, and selection. More specifically, we analyze the statistical distribution of the waiting time T_m until the expression of these genes has changed for all individuals, when transcription factors recognize and attach to short DNA sequences (binding sites) within regulatory sequences in the neighborhoods of the m genes. The evolutionary dynamics is described by a multitype Moran process, where each individual is assigned an m×L regulatory array that consists of regulatory sequences with L nucleotides for all m genes. We study how the waiting time distribution depends on the number of genes, the mutation rate, the length of the binding sites, the length of the regulatory sequences, and the way in which the targeted binding sites are coordinated for different genes in terms of selection coefficients. These selection coefficients depend on how many binding sites have appeared so far, and possibly on their order of appearance. We also allow for back mutations, whereby some acquired binding sites may be lost over time. It is further assumed that the mutation rate is small enough to warrant a fixed state population, so that all individuals have the same regulatory array at any given time point, until the next successful mutation arrives in some individual and spreads to the rest of the population. We further incorporate stochastic tunneling, whereby successful mutations acquire further mutations before their fixation. A crucial part of our approach is to divide the huge state space of regulatory arrays into a small number of components, assuming that the array component varies as a Markov process over time. This implies that T_m is the time until this Markov process hits an absorbing state, and hence has a phase-type distribution. A number of interesting results can be derived from our general setup, for instance that the expected waiting time increases exponentially with m for a selectively neutral model when back mutations are possible.
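
To illustrate how a waiting time to absorption can be computed, and how back mutations can make it grow quickly with m, here is a minimal sketch of a toy chain that is not the model of the paper. It assumes a birth-death chain whose state k is the number of binding sites acquired so far, with a single hypothetical gain rate `forward_rate` and a single hypothetical loss rate `back_rate`; the function name, the rates, and the reduction to one counter are all assumptions made for illustration. The expected time to first reach state m from state 0 follows from a first-step recursion.

```python
def expected_waiting_time(m, forward_rate, back_rate):
    """Expected time for a toy birth-death chain started at 0 to first reach m.

    Assumptions (illustrative only, not taken from the paper):
      - state k = number of binding sites acquired so far
      - k -> k+1 at rate forward_rate for every k < m
      - k -> k-1 at rate back_rate for every k >= 1 (back mutation)
    """
    if m < 0 or forward_rate <= 0 or back_rate < 0:
        raise ValueError("need m >= 0, forward_rate > 0, back_rate >= 0")

    total = 0.0
    step_time = 0.0  # expected time to move from k-1 to k (zero before state 0)
    for _ in range(m):
        # First-step analysis gives h_k = 1/a + (b/a) * h_{k-1}, with h_0 = 1/a,
        # where a = forward_rate and b = back_rate.
        step_time = 1.0 / forward_rate + (back_rate / forward_rate) * step_time
        total += step_time
    return total


if __name__ == "__main__":
    # Compare no back mutation with back mutation twice as fast as gain.
    for m in range(1, 9):
        without_back = expected_waiting_time(m, 1.0, 0.0)
        with_back = expected_waiting_time(m, 1.0, 2.0)
        print(m, without_back, with_back)
```

In this toy chain the expected time is linear in m when `back_rate` is zero and grows geometrically in m once `back_rate` exceeds `forward_rate`. How the rates of the actual model relate to this regime is not stated in the abstract.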

Keywords: Binding site; Continuous time Markov process; Coordinated mutations; Moran model; Phase-type distribution; Regulatory sequence; Waiting time.

MeSH terms

  • Binding Sites / genetics
  • Evolution, Molecular
  • Genetic Drift*
  • Humans
  • Models, Genetic*
  • Mutation
  • Selection, Genetic
  • Time Factors
  • Transcription Factors / genetics

Substances

  • Transcription Factors