Critical length in long-read resequencing

NAR Genom Bioinform. 2020 Jan 13;2(1):lqz027. doi: 10.1093/nargab/lqz027. eCollection 2020 Mar.

Abstract

Long-read sequencing has substantial advantages for structural variant discovery and phasing of variants compared to short-read technologies, but the required and optimal read length has not been assessed. In this work, we used long reads simulated from human genomes and evaluated structural variant discovery and variant phasing using current best practice bioinformatics methods. We determined that optimal discovery of structural variants from human genomes can be obtained with reads of minimally 20 kb. Haplotyping variants across genes only reaches its optimum from reads of 100 kb. These findings are important for the design of future long-read sequencing projects.