Experimental and bioinformatics considerations in cancer application of single cell genomics

Comput Struct Biotechnol J. 2020 Dec 23:19:343-354. doi: 10.1016/j.csbj.2020.12.021. eCollection 2021.

Abstract

Single cell genomics offers unprecedented resolution for interrogating genetic heterogeneity in a patient's tumour at the intercellular level. However, the DNA yield per cell is insufficient for today's sequencing library preparation protocols. This necessitates DNA amplification, which is a key source of experimental noise. We provide an evaluation of two protocols using microfluidics-based amplification for whole exome sequencing, an experimental scenario commonly used in single cell genomics. The results highlight their respective biases and relative strengths in the identification of single nucleotide variations. To this end, we introduce a workflow, SoVaTSiC, which allows for quality evaluation and somatic variant identification in single cell data. As a proof of concept, the framework was applied to study a lung adenocarcinoma tumour. The analysis provides insights into tumour phylogeny by identifying key mutational events in lung adenocarcinoma evolution. This inference is supported by the histology of the tumour, which demonstrates the usefulness of the approach.

Keywords: Protocol aware bioinformatics; Single cell genomics; Single cell somatic variant caller; Whole genome amplification.

Abbreviations: ADO, Allelic dropout; CNV, Copy number variation; FP, False positives; GMM, Gaussian Mixture Model; SNV, Single nucleotide variation; TP, True positives; WGA, Whole genome amplification.