PGG.SV: a whole-genome-sequencing-based structural variant resource and data analysis platform

Nucleic Acids Res. 2023 Jan 6;51(D1):D1109-D1116. doi: 10.1093/nar/gkac905.

Abstract

Structural variations (SVs) play important roles in human evolution and diseases, but there is a lack of data resources concerning representative samples, especially for East Asians. Taking advantage of both next-generation sequencing and third-generation sequencing data at the whole-genome level, we developed the database PGG.SV to provide a practical platform for both regionally and globally representative structural variants. In its current version, PGG.SV archives 584 277 SVs obtained from whole-genome sequencing data of 6048 samples, including 1030 long-read sequencing genomes representing 177 global populations. PGG.SV provides (i) high-quality SVs with fine-scale and precise genomic locations in both GRCh37 and GRCh38, covering underrepresented SVs in existing sequencing and microarray data; (ii) hierarchical estimation of SV prevalence in geographical populations; (iii) informative annotations of SV-related genes, potential functions and clinical effects; (iv) an analysis platform to facilitate SV-based case-control association studies and (v) various visualization tools for understanding the SV structures in the human genome. Taken together, PGG.SV provides a user-friendly online interface, easy-to-use analysis tools and a detailed presentation of results. PGG.SV is freely accessible via https://www.biosino.org/pggsv.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Genetic
  • Genome, Human
  • Genomic Structural Variation
  • Genomics* / methods
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Sequence Analysis, DNA / methods
  • Whole Genome Sequencing