CONIPHER: a computational framework for scalable phylogenetic reconstruction with error correction

Nat Protoc. 2024 Jan;19(1):159-183. doi: 10.1038/s41596-023-00913-9. Epub 2023 Nov 28.

Abstract

Intratumor heterogeneity provides the fuel for the evolution and selection of subclonal tumor cell populations. However, accurate inference of tumor subclonal architecture and reconstruction of tumor evolutionary histories from bulk DNA sequencing data remain challenging. Frequently, sequencing and alignment artifacts are not fully filtered out from cancer somatic mutations, and errors in the identification of copy number alterations or complex evolutionary events (e.g., mutation losses) affect the estimated cellular prevalence of mutations. Together, such errors propagate into the analysis of mutation clustering and phylogenetic reconstruction. In this Protocol, we present a new computational framework, CONIPHER (COrrecting Noise In PHylogenetic Evaluation and Reconstruction), that accurately infers subclonal structure and phylogenetic relationships from multisample tumor sequencing, accounting for both copy number alterations and mutation errors. CONIPHER has been used to reconstruct subclonal architecture and tumor phylogeny from multisample tumors with high-depth whole-exome sequencing from the TRACERx421 dataset, as well as from matched primary-metastatic cases. CONIPHER outperforms similar methods on simulated datasets and, in particular, scales to a large number of tumor samples and clones, while completing in under 1.5 h on average. CONIPHER enables automated phylogenetic analysis that can be effectively applied to large sequencing datasets generated with different technologies. Running CONIPHER requires only a basic knowledge of bioinformatics and of the R and bash scripting languages.
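
As an illustration of why copy number errors affect the estimated cellular prevalence of a mutation, the sketch below applies a commonly used textbook relation between variant allele frequency (VAF), tumor purity and local copy number. It is not taken from CONIPHER and may differ from what the framework computes; the function name, parameter names and example values are hypothetical.

```python
def cancer_cell_fraction(vaf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Estimate the fraction of cancer cells carrying a mutation.

    Assumed (textbook) relation, not CONIPHER's own model:
        CCF = VAF * (purity * tumor_cn + (1 - purity) * normal_cn)
              / (purity * multiplicity)
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of DNA copies per cell at this locus in the bulk sample
    average_copies = purity * tumor_cn + (1 - purity) * normal_cn
    return vaf * average_copies / (purity * multiplicity)


# Hypothetical mutation: VAF 0.25 in a sample with 50% tumor purity.
correct = cancer_cell_fraction(vaf=0.25, purity=0.5, tumor_cn=2)
# Same mutation, but the locus is wrongly called as 4 copies in the tumor.
miscalled = cancer_cell_fraction(vaf=0.25, purity=0.5, tumor_cn=4)
print(correct, miscalled)
```

Under these assumptions, a wrong copy number call alone shifts the estimate from 1.0 (present in every cancer cell) to 1.5, a value that cannot be a real cell fraction. Errors of this kind are what can then distort mutation clustering and tree building.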

Publication types

  • Review

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Humans
  • Mutation
  • Neoplasms* / genetics
  • Neoplasms* / pathology
  • Phylogeny
  • Sequence Analysis, DNA