TarSeqQC: Quality control on targeted sequencing experiments in R

Hum Mutat. 2017 May;38(5):494-502. doi: 10.1002/humu.23204. Epub 2017 Mar 21.

Abstract

Targeted sequencing (TS) is increasingly used as a screening methodology in research and medical genetics to identify genomic alterations causing human diseases. In general, a list of possible genomic variants is derived from mapped reads through a variant calling step. This processing step is usually based on variant coverage, although it may be affected by several factors. Therefore, undercovered clinically relevant variants may not be reported, affecting pathology diagnosis or treatment. Thus, prior quality control of the experiment is critical to determine variant detection accuracy and to avoid erroneous medical conclusions. There are several quality control tools, but they focus on issues related to whole-genome sequencing. In TS, however, quality control should assess experiment, gene, and genomic region performance based on the coverage achieved. Here, we propose the TarSeqQC R package for quality control in TS experiments. The tool is freely available from the Bioconductor repository. TarSeqQC was used to analyze two datasets; low-performance primer pools and features were detected, enhancing the quality of experiment results. Read count profiles were also explored, showing TarSeqQC's effectiveness as an exploration tool. Our proposal may be a valuable bioinformatic tool for routine TS experiments in both research and medical genetics.
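
The idea of assessing performance per feature, per gene, and per primer pool from achieved coverage can be illustrated with a minimal sketch. It is written in Python with made-up data and does not reproduce TarSeqQC's API or method; the feature identifiers, gene names, pool numbers, the coverage threshold of 100, and both helper functions are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-feature records: (feature id, gene, primer pool, mean coverage).
FEATURES = [
    ("AMP1", "GENE_A", 1, 520.0),
    ("AMP2", "GENE_A", 2, 35.0),
    ("AMP3", "GENE_B", 1, 410.0),
    ("AMP4", "GENE_B", 2, 60.0),
    ("AMP5", "GENE_C", 1, 300.0),
]


def low_coverage_features(features, min_coverage):
    """Return ids of features whose mean coverage is below min_coverage."""
    return [fid for fid, _gene, _pool, cov in features if cov < min_coverage]


def median_coverage_by(features, key_index):
    """Group records by the field at key_index; return median coverage per group."""
    groups = defaultdict(list)
    for record in features:
        groups[record[key_index]].append(record[3])
    return {key: median(values) for key, values in groups.items()}


# Assumed threshold; an appropriate value depends on the assay and variant caller.
flagged = low_coverage_features(FEATURES, min_coverage=100.0)
by_gene = median_coverage_by(FEATURES, key_index=1)
by_pool = median_coverage_by(FEATURES, key_index=2)
```

In this toy data, both flagged features belong to pool 2, whose median coverage is far below that of pool 1. A pool-level summary of this kind is one way a low-performance primer pool could show up.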

Keywords: Cancer panel; R package; experiment performance; medical genetics; quality control; targeted sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Computational Biology / standards
  • Datasets as Topic
  • Genomics / methods*
  • Genomics / standards
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Neoplasms / genetics
  • Quality Control
  • Reproducibility of Results
  • Software* / standards
  • User-Computer Interface