An efficient and tunable parameter to improve variant calling for whole genome and exome sequencing data

Genes Genomics. 2018 Jan;40(1):39-47. doi: 10.1007/s13258-017-0608-6. Epub 2017 Aug 29.

Abstract

Next generation sequencing (NGS) has been widely applied in fields ranging from agriculture to clinical research, and many sequencing platforms are available for obtaining accurate and consistent results. However, these platforms show amplification bias that can affect variant calling in personal genomes. Here, we sequenced whole genomes and whole exomes from ten Korean individuals using Illumina and Ion Proton, respectively, to assess the vulnerability and accuracy of each NGS platform in GC-rich and GC-poor regions. Overall, a total of 1013 Gb of reads from Illumina and ~39.1 Gb of reads from Ion Proton were analyzed using the BWA-GATK variant calling pipeline. Furthermore, in conjunction with the VQSR tool and detailed filtering strategies, we obtained high-quality variants. Finally, ten variants each from the Illumina-only, Ion Proton-only, and intersection sets were selected for Sanger validation. The validation results revealed that the Illumina platform showed higher accuracy than Ion Proton. The described filtering methods are advantageous for large population-based whole genome studies designed to identify common and rare variations associated with complex diseases.
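The three validation groups described above (Illumina-only, Ion Proton-only, and intersection) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Variant` type, the `partition_calls` function, and the toy variant records are hypothetical, and the sketch assumes two calls match only when chromosome, position, reference allele, and alternate allele are all identical.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key; real VCF records carry more fields."""
    chrom: str
    pos: int
    ref: str
    alt: str


def partition_calls(illumina: Iterable[Variant], ion_proton: Iterable[Variant]):
    """Split two call sets into platform-specific and shared variants.

    Assumption: exact match on (chrom, pos, ref, alt) defines concordance.
    """
    a, b = set(illumina), set(ion_proton)
    return {
        "illumina_only": a - b,    # called by Illumina only
        "ion_proton_only": b - a,  # called by Ion Proton only
        "intersection": a & b,     # called by both platforms
    }


# Toy records invented for illustration; they are not data from the study.
illumina_calls = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 200, "C", "T"),
    Variant("chr2", 50, "G", "A"),
]
ion_proton_calls = [
    Variant("chr1", 200, "C", "T"),
    Variant("chr2", 50, "G", "C"),  # same site, different alternate allele
    Variant("chr3", 10, "T", "C"),
]

groups = partition_calls(illumina_calls, ion_proton_calls)
for name, variants in groups.items():
    print(name, len(variants))
```

Under this matching rule, a site called by both platforms with different alternate alleles counts as platform-specific in each set; a position-only match would be a looser alternative.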

Keywords: Illumina; Ion Proton; Variant calling; Whole exome sequencing; Whole genome sequencing.

MeSH terms

  • Base Sequence
  • Exome Sequencing / methods*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Polymorphism, Single Nucleotide
  • Republic of Korea
  • Sequence Analysis, DNA / methods*
  • Software
  • Whole Genome Sequencing / methods*