tHapMix: simulating tumour samples through haplotype mixtures

Bioinformatics. 2017 Jan 15;33(2):280-282. doi: 10.1093/bioinformatics/btw589. Epub 2016 Sep 7.

Abstract

Motivation: Large-scale rearrangements and copy number changes combined with different modes of clonal evolution create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and create well-calibrated benchmarks.

Results: We developed a new simulation framework, tHapMix, that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It easily scales to the simulation of hundreds of somatic genomes, while re-use of real read data preserves the noise and biases present in sequencing platforms. We further demonstrate the utility of tHapMix by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools.
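To make the roles of purity and polyclonality concrete, the following is a minimal Python sketch of how the expected average copy number at a single locus could be computed for a mixture of tumour clones and normal cells. It is an illustration under stated assumptions, not the tHapMix implementation: the function name `expected_copy_number`, its parameters, and the simple linear mixing model are all hypothetical and are not taken from the tool or the paper.

```python
def expected_copy_number(clone_copy_numbers, clone_fractions, purity,
                         normal_copy_number=2):
    """Expected average copy number at one locus in a bulk sample.

    Hypothetical helper (not part of tHapMix). Assumes a linear mixture:
    tumour clones are weighted by their fraction of the tumour cells, and
    the tumour as a whole is weighted by the sample purity.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if len(clone_copy_numbers) != len(clone_fractions):
        raise ValueError("one fraction is required per clone")
    if abs(sum(clone_fractions) - 1.0) > 1e-9:
        raise ValueError("clone fractions must sum to 1")

    # Average copy number across tumour clones (polyclonality).
    tumour_cn = sum(cn * f for cn, f in zip(clone_copy_numbers, clone_fractions))

    # Dilute the tumour signal with normal cells (purity).
    return purity * tumour_cn + (1.0 - purity) * normal_copy_number


# Two clones with copy numbers 3 and 4, making up 75% and 25% of the
# tumour cells, in a sample that is 50% tumour.
mixed = expected_copy_number([3, 4], [0.75, 0.25], purity=0.5)
print(mixed)
```

Under this assumed model, lower purity pulls the observed copy number towards the normal diploid value, which is one reason why samples with varied purity and clonal composition are useful when benchmarking copy number callers.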

Availability and implementation: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/tHapMix.

Contact: sivakhno@illumina.com

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Computer Simulation
  • DNA Copy Number Variations*
  • DNA, Neoplasm
  • Genome
  • Genomics / methods*
  • Haplotypes*
  • Humans
  • Neoplasms / genetics*
  • Ploidies*
  • Software*

Substances

  • DNA, Neoplasm