SAGE: a comprehensive resource of genetic variants integrating South Asian whole genomes and exomes

Database (Oxford). 2018 Jan 1;2018:1-10. doi: 10.1093/database/bay080.

Abstract

South Asia is home to ~20% of the world population and is characterized by distinct ethnic, linguistic, cultural and genetic lineages. Only a limited number of representative samples from the region have found their place in large population-scale international genome projects. The recent availability in the public domain of genome-scale data from multiple populations and datasets from South Asian countries motivated us to integrate the data into a comprehensive resource. In the present study, we have integrated a total of six datasets encompassing 1213 human exomes and genomes to create a compendium of 154 814 557 genetic variants, including a total of 69 059 255 novel variants. The variants were systematically annotated using public resources and, along with their allele frequencies, are available as a browsable online resource, South Asian Genomes and Exomes (SAGE). As a proof-of-principle application of the data and resource for genetic epidemiology, we have analyzed the pathogenic genetic variants causing retinitis pigmentosa. Our analysis reveals the genetic landscape of the disease and suggests that a subset of genetic variants is highly prevalent in South Asia.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Asian People / genetics*
  • Databases, Genetic
  • Exome / genetics*
  • Gene Frequency
  • Genetic Variation*
  • Genome, Human*
  • Humans
  • Molecular Epidemiology
  • Molecular Sequence Annotation
  • Publications