Population-scale genotyping of structural variation in the era of long-read sequencing

Comput Struct Biotechnol J. 2022 May 27:20:2639-2647. doi: 10.1016/j.csbj.2022.05.047. eCollection 2022.

Abstract

Population-scale studies of structural variation (SV) are growing rapidly worldwide with the development of long-read sequencing technology, yielding a considerable number of novel SVs and complete, gap-closed genome assemblies. Herein, we highlight recent studies that use a hybrid sequencing strategy and discuss the challenges that reference bias poses for large-scale SV genotyping. Genotyping SVs at a population scale remains difficult, which severely limits genotype-based population genetic studies and genome-wide association studies of complex diseases. We summarize efforts to improve genotype quality through linear or graph representations of reference and alternative alleles. Graph-based genotypers, which can integrate diverse genetic information, have been applied effectively to large and diverse cohorts and contribute to unbiased downstream analysis. Nevertheless, the field still urgently needs efficient tools for constructing complex graphs and performing sequence-to-graph alignment.

Keywords: Genotyping; Long-read sequencing; Pan-genome; Structural variation.

Publication types

  • Review