A study of transposable element-associated structural variations (TASVs) using a de novo-assembled Korean genome

Exp Mol Med. 2021 Apr;53(4):615-630. doi: 10.1038/s12276-021-00586-y. Epub 2021 Apr 8.

Abstract

Advances in next-generation sequencing (NGS) technology have made personal genome sequencing possible, and many individual human genomes have now been sequenced. Comparisons of these individual genomes have revealed substantial genomic differences between human populations as well as between individuals from closely related ethnic groups. Transposable elements (TEs) are known to be one of the major sources of these variations and act through various mechanisms, including de novo insertion, insertion-mediated deletion, and TE-TE recombination-mediated deletion. In this study, we carried out de novo whole-genome sequencing of one Korean individual (KPGP9) using multiple insert-size libraries. The de novo whole-genome assembly resulted in 31,305 scaffolds with a scaffold N50 size of 13.23 Mb. Furthermore, through computational data analysis and experimental verification, we found that 182 TE-associated structural variation (TASV) insertions and 89 TASV deletions contributed 64,232 bp in sequence gain and 82,772 bp in sequence loss, respectively, in the KPGP9 genome relative to the hg19 reference genome. We also verified structural differences associated with TASVs by comparative analysis with TASVs in recent genomes (AK1 and TCGA genomes) and reported their details. Here, we present a new Korean de novo whole-genome assembly and provide the first study, to our knowledge, focused on the identification of TASVs in an individual Korean genome. Our findings again highlight the role of TEs as a major driver of structural variations in individual human genomes.
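The scaffold N50 reported in the abstract is a standard assembly contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembled sequence. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `n50` and the toy scaffold lengths are hypothetical and chosen only for illustration; they are not data from the KPGP9 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length of the scaffold at which the cumulative sum of
    lengths, taken from longest to shortest, first reaches at least
    half of the total assembly size.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (illustrative values, not KPGP9 data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

For a real assembly, the input would be the list of scaffold lengths read from the assembly FASTA or its index; a higher N50 generally indicates a more contiguous assembly, although it says nothing about correctness.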

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alu Elements
  • Computational Biology
  • DNA Transposable Elements*
  • Databases, Genetic
  • Genetic Variation*
  • Genetics, Population
  • Genome, Human*
  • Genomics* / methods
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Republic of Korea
  • Sequence Analysis, DNA / methods

Substances

  • DNA Transposable Elements