TOSCA: an automated Tumor Only Somatic CAlling workflow for somatic mutation detection without matched normal samples

Bioinform Adv. 2022 Sep 26;2(1):vbac070. doi: 10.1093/bioadv/vbac070. eCollection 2022.

Abstract

Motivation: Accurate classification of somatic variants in a tumor sample is often accomplished by using a paired normal tissue sample from the same patient, which allows private germline mutations to be separated from somatic variants. However, a paired normal sample is not always available, making reliable somatic variant calling more challenging. In the absence of a paired normal sample, in silico screening of variants against public or private databases and other filtering approaches are often used. Nevertheless, the difficulty of performing tumor-only calling with sufficient accuracy and the lack of open-source software have limited the application of these approaches in clinical research.

Results: To address these limitations, we developed TOSCA, the first automated tumor-only somatic calling workflow for whole-exome sequencing and targeted panel sequencing data. It performs an end-to-end analysis from raw read files through quality checks, alignment and variant calling to functional annotation, database filtering, tumor purity and ploidy estimation and variant classification. Applying our workflow to tumor-only data provides estimates of somatic and germline variants that are consistent with results from paired analyses.
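
The database filtering and variant classification steps named above can be illustrated with a minimal sketch. This is not TOSCA's implementation: the `Variant` record, the `classify` function, the field names and both thresholds are hypothetical, chosen only to show how a tumor-only call might be screened against a population database and then triaged by its variant allele fraction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical annotated variant record (not a TOSCA data structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    vaf: float                      # variant allele fraction observed in the tumor
    population_af: Optional[float]  # frequency in a germline database, None if absent
    in_somatic_db: bool             # present in a database of known somatic mutations


def classify(v: Variant,
             max_population_af: float = 0.001,  # assumed cutoff, for illustration only
             het_window: float = 0.1) -> str:   # assumed tolerance around VAF 0.5 and 1.0
    """Assign a coarse label to a tumor-only variant call."""
    # Variants common in the general population are most likely germline.
    if v.population_af is not None and v.population_af >= max_population_af:
        return "likely_germline"
    # Rare variants already catalogued as somatic are kept as somatic.
    if v.in_somatic_db:
        return "likely_somatic"
    # A VAF near 0.5 or 1.0 matches a heterozygous or homozygous germline
    # variant in a diploid region, so the call stays unresolved here.
    if abs(v.vaf - 0.5) <= het_window or v.vaf >= 1.0 - het_window:
        return "ambiguous"
    return "likely_somatic"


common_snp = Variant("chr1", 100, "A", "T", vaf=0.48, population_af=0.12, in_somatic_db=False)
known_hotspot = Variant("chr7", 200, "T", "A", vaf=0.22, population_af=None, in_somatic_db=True)
rare_balanced = Variant("chr2", 300, "G", "C", vaf=0.50, population_af=None, in_somatic_db=False)
rare_subclonal = Variant("chr3", 400, "C", "G", vaf=0.20, population_af=None, in_somatic_db=False)

for variant in (common_snp, known_hotspot, rare_balanced, rare_subclonal):
    print(variant.chrom, variant.pos, classify(variant))
```

The "ambiguous" label marks the case that database screening alone cannot settle. Estimates of tumor purity and ploidy change the allele fraction expected for a germline variant, which is one reason a workflow of this kind would compute them before the final classification.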

Availability and implementation: TOSCA is a Snakemake-based workflow and is freely available at https://github.com/mdelcorvo/TOSCA.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.