Pan-genome: setting a new standard for high-quality reference genomes

Yi Chuan. 2021 Nov 20;43(11):1023-1037. doi: 10.16288/j.yczz.21-214.

Abstract

With the release of high-quality reference genomes assembled from long reads produced by third-generation sequencing, together with extensive re-sequencing and population genetic analyses, researchers have found that a single reference genome does not represent the full diversity within a species. Sequences missing from the reference genome result in an incomplete map of population genetic polymorphism. A pan-genome addresses this deficiency of the single reference genome. It comprises the core genome (responsible for basic biological functions and the main phenotypic characteristics of a species) and the variable genome (related to genetic diversity or individual biological characteristics). Depending on the relative proportions of the core and variable genomes, a pan-genome can be classified as either open or closed. Here, we review current pan-genome research across a range of species and discuss the characteristics of pan-genomes in various biological groups. The pan-genomes of mammals are more likely to be closed, while those of microbes, angiosperms, and some invertebrates are likely to be open. Pan-genomic studies make it possible to complete the reference genome and obtain complete variation information, which will contribute to the study of the molecular mechanisms underlying genetic diversity and phenotypic evolution.

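With the successive release of high-quality reference genomes assembled from third-generation sequencing data, and the widespread use of large-scale re-sequencing and population genetic analysis, researchers have found that a reference genome derived from a single individual falls far short of covering all the genetic sequences of a species. The large number of missing sequences leaves the map of population genetic variation incomplete, and constructing a pan-genome from multiple individuals addresses this deficiency well. A pan-genome comprises the core genome, which is responsible for basic biological functions and the main phenotypic characteristics of the species, and the variable genome, which is related to the genetic diversity of the species and the uniqueness of individuals. Depending on the proportions of the core and variable genomes, pan-genomes are of two types, open and closed. This article reviews progress in pan-genomics research in bacteria, fungi, animals, and plants, and discusses the characteristics of pan-genomes in each biological group: mammalian pan-genomes are relatively closed, whereas the currently known pan-genomes of microbes, angiosperms, and some lower animals tend to be open. Constructing a pan-genome can improve existing reference genomes and capture the complete variation information of an entire species, which will help in-depth study of the molecular mechanisms that generate genetic diversity and phenotypic variation.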
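
The split between core and variable genome described in the abstract can be illustrated with a minimal Python sketch. It assumes each individual is represented simply as a set of gene-family identifiers (a presence/absence view); the function names, the toy identifiers, and the idea of tracking sizes as individuals are added are illustrative assumptions, not the method used in the review. The sketch reports the core fraction and the growth of the pan-genome but deliberately sets no numeric cut-off for "open" versus "closed", since the abstract does not give one.

```python
from typing import Dict, List, Set, Tuple


def partition_pan_genome(
    gene_sets: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, variable) gene sets for a group of individuals.

    pan      = genes seen in at least one individual
    core     = genes seen in every individual
    variable = genes seen in some but not all individuals
    """
    if not gene_sets:
        raise ValueError("at least one individual is required")
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)
    core = set.intersection(*all_sets)
    return pan, core, pan - core


def growth_curve(
    gene_sets: Dict[str, Set[str]], order: List[str]
) -> List[Tuple[int, int]]:
    """Pan and core sizes after adding individuals one by one in `order`."""
    sizes = []
    pan: Set[str] = set()
    core: Set[str] = set()
    for i, name in enumerate(order):
        genes = gene_sets[name]
        pan = pan | genes
        # The first individual defines the initial core; later ones shrink it.
        core = set(genes) if i == 0 else core & genes
        sizes.append((len(pan), len(core)))
    return sizes


# Hypothetical toy data: three individuals, six gene families.
toy = {
    "ind1": {"g1", "g2", "g3", "g4"},
    "ind2": {"g1", "g2", "g3", "g5"},
    "ind3": {"g1", "g2", "g6"},
}

pan, core, variable = partition_pan_genome(toy)
core_fraction = len(core) / len(pan)
curve = growth_curve(toy, ["ind1", "ind2", "ind3"])
```

In this toy case the pan-genome keeps growing with every added individual while the core shrinks, which is the qualitative pattern associated with an open pan-genome; a pan-genome whose size levels off as individuals are added would look closed. Real analyses work from gene or sequence presence and absence variations called across many assemblies rather than hand-written sets.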

Keywords: core genome; pan-genome; presence and absence variations; variable genome.

Publication types

  • Review

MeSH terms

  • Genome* / genetics
  • Genomics*