A comprehensive quality control workflow for paired tumor-normal NGS experiments

Bioinformatics. 2017 Jun 1;33(11):1721-1722. doi: 10.1093/bioinformatics/btx032.

Abstract

Summary: Quality control (QC) is an important part of all NGS data analysis stages. Many available tools calculate QC metrics from different analysis steps of single-sample experiments (raw reads, mapped reads and variant lists). Multi-sample experiments, such as the sequencing of tumor-normal pairs, require additional QC metrics to ensure the validity of results. These multi-sample QC metrics still lack standardization. We therefore suggest a new workflow for QC of DNA sequencing of tumor-normal pairs. With this workflow, well-known single-sample QC metrics and additional metrics specific to tumor-normal pairs can be calculated. The segmentation into different tools offers high flexibility and allows reuse for other purposes. All tools produce qcML, a generic XML format for QC of -omics experiments. qcML uses quality metrics defined in an ontology, which was adapted for NGS.
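Because every tool emits qcML, downstream scripts can in principle collect metrics by their ontology accession instead of parsing tool-specific text output. The following is a minimal sketch of that idea, not the workflow's own code: the embedded document is a toy example, and its element names, attribute names, accessions and values are assumptions chosen for illustration rather than actual ngs-bits output.

```python
import xml.etree.ElementTree as ET

# Toy qcML-like document. Element names, attribute names, accessions and
# values are illustrative assumptions, not output of the ngs-bits tools.
EXAMPLE_QCML = """
<qcML>
  <runQuality ID="rq0001">
    <qualityParameter ID="qp0001" name="example metric A" accession="QC:EXAMPLE1" value="1000000"/>
    <qualityParameter ID="qp0002" name="example metric B" accession="QC:EXAMPLE2" value="0.97"/>
  </runQuality>
</qcML>
"""


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}name' if present."""
    return tag.rsplit("}", 1)[-1]


def read_quality_parameters(xml_text):
    """Map each ontology accession to its (name, value) pair."""
    root = ET.fromstring(xml_text.strip())
    params = {}
    for elem in root.iter():
        # Match on the local tag name so namespaced documents also work.
        if local_name(elem.tag) == "qualityParameter":
            params[elem.get("accession")] = (elem.get("name"), elem.get("value"))
    return params


if __name__ == "__main__":
    for accession, (name, value) in sorted(read_quality_parameters(EXAMPLE_QCML).items()):
        print(accession, name, value)
```

Keying the result on the accession rather than the display name reflects the role of the ontology: two tools that report the same metric should be comparable even if their labels differ.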

Availability and implementation: All QC tools are implemented in C++ and run under both Linux and Windows. Plotting requires Python 2.7 and matplotlib. The software is available under the 'GNU General Public License version 2' as part of the ngs-bits project: https://github.com/imgag/ngs-bits.

Contact: christopher.schroeder@med.uni-tuebingen.de.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • High-Throughput Nucleotide Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / standards
  • Humans
  • Neoplasms / genetics*
  • Quality Control*
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards
  • Software*
  • Workflow*