Benchmarking pipelines for subclonal deconvolution of bulk tumour sequencing data

Georgette Tanner; David R Westhead; Alastair Droop; Lucy F Stead

doi:10.1038/s41467-021-26698-7

Nat Commun. 2021 Nov 4;12(1):6396. doi: 10.1038/s41467-021-26698-7.

Authors

Georgette Tanner¹, David R Westhead², Alastair Droop³, Lucy F Stead⁴

Affiliations

¹ Leeds Institute of Medical Research, Faculty of Medicine and Health, University of Leeds, St James's University Hospital, Beckett Street, Leeds, West Yorkshire, LS9 7TF, UK.
² School of Molecular and Cellular Biology, University of Leeds, Leeds, West Yorkshire, LS2 9JT, UK.
³ Wellcome Sanger Institute, Hinxton, Saffron Walden, CB10 1RQ, UK.
⁴ Leeds Institute of Medical Research, Faculty of Medicine and Health, University of Leeds, St James's University Hospital, Beckett Street, Leeds, West Yorkshire, LS9 7TF, UK. l.f.stead@leeds.ac.uk.

Abstract

Intratumour heterogeneity provides tumours with the ability to adapt and acquire treatment resistance. The development of more effective and personalised treatments for cancers, therefore, requires accurate characterisation of the clonal architecture of tumours, enabling evolutionary dynamics to be tracked. Many methods exist for achieving this from bulk tumour sequencing data, involving identifying mutations and performing subclonal deconvolution, but there is a lack of systematic benchmarking to inform researchers on which are most accurate, and how dataset characteristics impact performance. To address this, we use the most comprehensive tumour genome simulation tool available for such purposes to create 80 bulk tumour whole exome sequencing datasets of differing depths, tumour complexities, and purities, and use these to benchmark subclonal deconvolution pipelines. We conclude that i) tumour complexity does not impact accuracy, ii) increasing either purity or purity-corrected sequencing depth improves accuracy, and iii) the optimal pipeline consists of Mutect2, FACETS and PyClone-VI. We have made our benchmarking datasets publicly available for future use.
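The abstract does not define "purity-corrected sequencing depth". A minimal sketch of one common reading follows, assuming it means mean depth multiplied by tumour purity, so that it approximates the number of reads per site that come from tumour cells. The function names, and the standard expected variant allele frequency (VAF) relation included for context, are illustrative assumptions and are not taken from the paper.

```python
def purity_corrected_depth(depth: float, purity: float) -> float:
    """Assumed definition: mean sequencing depth scaled by tumour purity."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return depth * purity


def expected_vaf(purity: float, ccf: float, multiplicity: int = 1,
                 tumour_cn: int = 2, normal_cn: int = 2) -> float:
    """Expected VAF of a mutation present in a fraction `ccf` of tumour cells.

    Mutated copies in the sample divided by all copies at the locus,
    counting both tumour and contaminating normal cells.
    """
    mutated_copies = purity * ccf * multiplicity
    total_copies = purity * tumour_cn + (1.0 - purity) * normal_cn
    return mutated_copies / total_copies


if __name__ == "__main__":
    # Hypothetical sample: 100x mean depth at 50% purity.
    print(purity_corrected_depth(100, 0.5))
    # Clonal heterozygous mutation in a diploid region at 50% purity.
    print(expected_vaf(purity=0.5, ccf=1.0))
```

Under this reading, a 100x sample at 50% purity and a 50x sample at 100% purity have the same purity-corrected depth, which is why depth and purity can be hard to separate when comparing datasets.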

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Benchmarking / methods*
Exome / genetics
High-Throughput Nucleotide Sequencing / methods
Humans
Software*
