Benchmarking pipelines for subclonal deconvolution of bulk tumour sequencing data

Nat Commun. 2021 Nov 4;12(1):6396. doi: 10.1038/s41467-021-26698-7.

Abstract

Intratumour heterogeneity provides tumours with the ability to adapt and acquire treatment resistance. The development of more effective and personalised treatments for cancers, therefore, requires accurate characterisation of the clonal architecture of tumours, enabling evolutionary dynamics to be tracked. Many methods exist for achieving this from bulk tumour sequencing data, involving identifying mutations and performing subclonal deconvolution, but there is a lack of systematic benchmarking to inform researchers on which are most accurate, and how dataset characteristics impact performance. To address this, we use the most comprehensive tumour genome simulation tool available for such purposes to create 80 bulk tumour whole exome sequencing datasets of differing depths, tumour complexities, and purities, and use these to benchmark subclonal deconvolution pipelines. We conclude that i) tumour complexity does not impact accuracy, ii) increasing either purity or purity-corrected sequencing depth improves accuracy, and iii) the optimal pipeline consists of Mutect2, FACETS and PyClone-VI. We have made our benchmarking datasets publicly available for future use.
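As a rough illustration of why purity and depth interact in conclusion ii), the sketch below shows how normal-cell contamination dilutes both the number of tumour-derived reads and the expected variant allele fraction (VAF). It is not taken from the paper: the definition of purity-corrected depth as raw depth multiplied by purity is an assumption made here, the function names are hypothetical, and the VAF expression is the standard mixture model in which a fraction (the purity) of the sample's cells are tumour cells and the rest are normal cells.

```python
def purity_corrected_depth(depth: float, purity: float) -> float:
    """Approximate tumour-derived depth.

    ASSUMPTION: purity-corrected depth is taken to be raw depth * purity;
    the abstract does not state the exact definition used in the paper.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return depth * purity


def expected_vaf(
    purity: float,
    ccf: float = 1.0,
    multiplicity: int = 1,
    tumour_cn: int = 2,
    normal_cn: int = 2,
) -> float:
    """Expected VAF of a somatic mutation in an impure bulk sample.

    purity       : fraction of cells in the sample that are tumour cells
    ccf          : cancer cell fraction carrying the mutation (1.0 = clonal)
    multiplicity : mutated copies per mutated tumour cell
    tumour_cn    : total copy number at the locus in tumour cells
    normal_cn    : total copy number at the locus in normal cells
    """
    mutant_copies = purity * ccf * multiplicity
    total_copies = purity * tumour_cn + (1.0 - purity) * normal_cn
    return mutant_copies / total_copies


if __name__ == "__main__":
    # Illustrative values only; these are not datasets from the study.
    for depth, purity in [(200, 0.9), (200, 0.5), (400, 0.5)]:
        print(
            f"depth={depth} purity={purity} "
            f"corrected_depth={purity_corrected_depth(depth, purity):.0f} "
            f"clonal_vaf={expected_vaf(purity):.3f} "
            f"subclonal_vaf={expected_vaf(purity, ccf=0.5):.3f}"
        )
```

Under these assumptions, halving purity at a fixed raw depth halves the tumour-derived reads and pushes clonal and subclonal VAFs closer together, which is one intuition for why deconvolution accuracy would depend on both quantities.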

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking / methods*
  • Exome / genetics
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Software*