A database of 5305 healthy Korean individuals reveals genetic and clinical implications for an East Asian population

Exp Mol Med. 2022 Nov;54(11):1862-1871. doi: 10.1038/s12276-022-00871-4. Epub 2022 Nov 2.

Abstract

Despite substantial advances in disease genetics, studies to date have largely focused on individuals of European descent, which limits the discovery of novel functional genetic variants in other ethnic groups. To alleviate the paucity of East Asian population genome resources, we established the Korean Variant Archive 2 (KOVA 2), which comprises 1896 whole-genome sequences and 3409 whole-exome sequences (5305 in total) from healthy individuals of Korean ethnicity. This is the largest genome database from the ethnic Korean population to date, surpassing the 1909 Korean individuals deposited in gnomAD. The variants in KOVA 2 displayed all the known genetic features of variants in previous genome databases, and we compiled data from Korean-specific runs of homozygosity, positively selected intervals, and structural variants. In doing so, we found loci, such as ADH1A/1B and UHRF1BP1, that are strongly selected in the Korean population relative to other East Asian populations. Our analysis of allele ages revealed a correlation between variant functionality and evolutionary age. The data can be browsed and downloaded from a public website (https://www.kobic.re.kr/kova/). We anticipate that KOVA 2 will serve as a valuable resource for genetic studies involving East Asian populations.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Asian People* / genetics
  • Exome*
  • Humans
  • Polymorphism, Single Nucleotide
  • Republic of Korea