Neural network and kinetic modelling of human genome replication reveal replication origin locations and strengths

PLoS Comput Biol. 2023 May 30;19(5):e1011138. doi: 10.1371/journal.pcbi.1011138. eCollection 2023 May.

Abstract

In humans and other metazoans, the determinants of replication origin location and strength are still elusive. Origins are licensed in G1 phase and fired in S phase of the cell cycle. It is debated which of these two temporally separate steps determines origin efficiency. Experiments can independently profile mean replication timing (MRT) and replication fork directionality (RFD) genome-wide. Such profiles contain information on multiple origins' properties and on fork speed. Due to possible origin inactivation by passive replication, however, observed and intrinsic origin efficiencies can markedly differ. Thus, there is a need for methods to infer intrinsic origin efficiency from observed origin efficiency, which is context-dependent. Here, we show that MRT and RFD data are highly consistent with each other but contain information at different spatial scales. Using neural networks, we infer an origin licensing landscape that, when inserted in an appropriate simulation framework, jointly predicts MRT and RFD data with unprecedented precision and underlines the importance of dispersive origin firing. We furthermore uncover an analytical formula that predicts intrinsic origin efficiency from observed origin efficiency combined with MRT data. Comparison of inferred intrinsic origin efficiencies with experimental profiles of licensed origins (ORC, MCM) and actual initiation events (Bubble-seq, SNS-seq, OK-seq, ORM) shows that intrinsic origin efficiency is not solely determined by licensing efficiency. Thus, human replication origin efficiency is set at both the origin licensing and firing steps.

MeSH terms

  • Chromosomes
  • DNA Replication* / genetics
  • Humans
  • Neural Networks, Computer
  • Replication Origin* / genetics
  • Virus Replication

Grants and funding

This work was supported by the Agence Nationale de la Recherche (ANR-18-CE45-0002 and ANR-19-CE12-0028) and the Cancéropôle Ile-de-France and the INCa (PL-BIO16-302). OH was also supported by the Ligue Nationale Contre le Cancer (Comité de Paris; RS19/75-75), the Association pour la Recherche sur le Cancer (PJA 20171206387) and the Fondation pour la Recherche Médicale (FRM EQU202203014910). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Hadi Kabalane and Jeremy Barbier received a salary from the ANR.