Comparative analysis of 7 short-read sequencing platforms using the Korean Reference Genome: MGI and Illumina sequencing benchmark for whole-genome sequencing

Gigascience. 2021 Mar 12;10(3):giab014. doi: 10.1093/gigascience/giab014.

Abstract

Background: DNBSEQ-T7 is a new whole-genome sequencer developed by Complete Genomics and MGI using DNA nanoball and combinatorial probe anchor synthesis technologies to generate short reads at a very large scale-up to 60 human genomes per day. However, it has not been objectively and systematically compared against Illumina short-read sequencers.

Findings: By using the same KOREF sample, the Korean Reference Genome, we have compared 7 sequencing platforms including BGISEQ-500, DNBSEQ-T7, HiSeq2000, HiSeq2500, HiSeq4000, HiSeqX10, and NovaSeq6000. We measured sequencing quality by comparing sequencing statistics (base quality, duplication rate, and random error rate), mapping statistics (mapping rate, depth distribution, and percent GC coverage), and variant statistics (transition/transversion ratio, dbSNP annotation rate, and concordance rate with single-nucleotide polymorphism [SNP] genotyping chip) across the 7 sequencing platforms. We found that MGI platforms showed a higher concordance rate for SNP genotyping than HiSeq2000 and HiSeq4000. The similarity matrix of variant calls confirmed that the 2 MGI platforms have the most similar characteristics to the HiSeq2500 platform.

Conclusions: Overall, MGI and Illumina sequencing platforms showed comparable levels of sequencing quality, uniformity of coverage, percent GC coverage, and variant accuracy; thus we conclude that the MGI platforms can be used for a wide range of genomics research fields at a lower cost than the Illumina platforms.

Keywords: DNBSEQ-T7; sequencing platform comparison; whole-genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking*
  • Genome, Human
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Republic of Korea
  • Sequence Analysis, DNA
  • Whole Genome Sequencing