CoVaCS: a consensus variant calling system

BMC Genomics. 2018 Feb 5;19(1):120. doi: 10.1186/s12864-018-4508-1.

Abstract

Background: The advent and ongoing development of next generation sequencing (NGS) technologies has led to a rapid increase in the volume of human genome re-sequencing data, paving the way for personalized genomics and precision medicine. This growing body of data underlines the need for accurate and time-effective bioinformatics systems for genotyping, a crucial prerequisite for the identification of candidate causal mutations in diagnostic screens.

Results: Here we present CoVaCS, a fully automated, highly accurate system with a web-based graphical interface for genotyping and variant annotation. Extensive tests on a gold standard benchmark data set (the NA12878 Illumina platinum genome) confirm that call-sets based on our consensus strategy are completely in line with those attained by similar command-line-based approaches, and far more accurate than call-sets from any individual tool. Importantly, our system exhibits better sensitivity and higher specificity than equivalent commercial software.
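
As an illustration of what a consensus call-set means in general, the following minimal Python sketch keeps a variant only when enough callers report it. The abstract does not state the exact consensus rule used by CoVaCS, so the support threshold (min_support), the function name consensus_calls, the caller names and the example variants are all assumptions made for illustration, not the published method.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(call_sets: Dict[str, Set[Variant]], min_support: int = 2) -> Set[Variant]:
    """Return the variants reported by at least `min_support` callers.

    Assumption: a simple vote across callers; the real system may use a
    different rule.
    """
    support: Counter = Counter()
    for calls in call_sets.values():
        # Each caller contributes at most one vote per variant.
        support.update(set(calls))
    return {variant for variant, votes in support.items() if votes >= min_support}


# Hypothetical call-sets from three hypothetical callers.
call_sets = {
    "caller_a": {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T")},
    "caller_b": {("chr1", 1000, "A", "G"), ("chr2", 500, "G", "GA")},
    "caller_c": {("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T")},
}

consensus = consensus_calls(call_sets, min_support=2)
```

Raising min_support trades sensitivity for specificity: requiring all three callers to agree would keep only the variant that every caller reports.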

Conclusions: CoVaCS offers optimized pipelines integrating state-of-the-art tools for variant calling and annotation for whole-genome sequencing (WGS), whole-exome sequencing (WES) and target-gene sequencing (TGS) data. The system is currently hosted at Cineca and offers the speed of an HPC facility, a crucial consideration when large numbers of samples must be analysed. Importantly, all analyses are performed automatically, allowing high reproducibility of the results. As such, we believe that CoVaCS can be a valuable tool for the analysis of human genome resequencing studies. CoVaCS is available at https://bioinformatics.cineca.it/covacs.

Keywords: Consensus method; Graphical user interface; Variant annotation; Variant calling; Variant prioritization; Web server; Workflow.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Consensus Sequence*
  • Databases, Genetic
  • INDEL Mutation
  • Molecular Sequence Annotation
  • Polymorphism, Single Nucleotide
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Software*
  • User-Computer Interface
  • Web Browser
  • Workflow