Theoretical guarantees for phylogeny inference from single-cell lineage tracing

Proc Natl Acad Sci U S A. 2023 Mar 21;120(12):e2203352120. doi: 10.1073/pnas.2203352120. Epub 2023 Mar 16.

Abstract

Lineage-tracing technologies based on Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9 (CRISPR-Cas9) genome editing have emerged as a powerful tool for investigating development in single-cell contexts, but exact reconstruction of the underlying clonal relationships in experiments is complicated by features of the data. These complications are functions of the experimental parameters in these systems, such as the Cas9 cutting rate, the diversity of indel outcomes, and the rate of missing data. In this paper, we develop two theoretically grounded algorithms for the reconstruction of the underlying single-cell phylogenetic tree, as well as asymptotic bounds for the number of recording sites necessary for exact recapitulation of the ground truth phylogeny with high probability. In doing so, we explore the relationship between the problem difficulty and the experimental parameters, with implications for experimental design. Lastly, we provide simulations that show the empirical performance of these algorithms and confirm that the trends in the asymptotic bounds hold empirically. Overall, this work provides a theoretical analysis of phylogenetic reconstruction in single-cell CRISPR-Cas9 lineage-tracing technologies.
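To make the role of the experimental parameters concrete, the following is a minimal toy simulation of irreversible recording at a set of sites. It is an illustrative sketch only, not the model or the algorithms analyzed in the paper. The function name `simulate_leaves`, the complete binary tree, the per-generation cut probability, the uniform choice among indel states, and the independent per-entry dropout are all assumptions made here for illustration.

```python
import random

UNEDITED = 0   # site has not been cut yet
MISSING = -1   # site could not be read out in this cell


def simulate_leaves(depth, n_sites, cut_prob, n_states, missing_prob, seed=0):
    """Simulate leaf character vectors on a complete binary tree (toy model).

    Assumptions (hypothetical, for illustration only):
      - every cell divides into two children at each generation;
      - an unedited site is cut with probability `cut_prob` per generation;
      - a cut site takes one of `n_states` indel states uniformly at random
        and never changes again (irreversible edit);
      - each leaf entry is independently lost with probability `missing_prob`.
    """
    rng = random.Random(seed)
    cells = [[UNEDITED] * n_sites]  # the root cell is fully unedited
    for _ in range(depth):
        next_generation = []
        for parent in cells:
            for _child in range(2):
                child = list(parent)  # children inherit the parent's edits
                for i, state in enumerate(child):
                    if state == UNEDITED and rng.random() < cut_prob:
                        child[i] = rng.randint(1, n_states)
                next_generation.append(child)
        cells = next_generation
    # Apply dropout only to the observed leaves.
    return [
        [MISSING if rng.random() < missing_prob else state for state in cell]
        for cell in cells
    ]
```

In this sketch, a low cut probability leaves most sites unedited and therefore uninformative, a small number of indel states makes it more likely that unrelated cells share a state by chance, and a high dropout rate removes signal from the leaves. These are the three parameters named in the abstract.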

Keywords: CRISPR-Cas9; computational phylogenetics; single-cell lineage tracing.

MeSH terms

  • CRISPR-Associated Protein 9 / genetics
  • CRISPR-Cas Systems* / genetics
  • Cell Lineage / genetics
  • Gene Editing*
  • Phylogeny

Substances

  • CRISPR-Associated Protein 9