A benchmark of structural variation detection by long reads through a realistic simulated model

Genome Biol. 2021 Dec 15;22(1):342. doi: 10.1186/s13059-021-02551-4.

Abstract

Accurate simulations of structural variation distributions and sequencing data are crucial for the development and benchmarking of new tools. We develop Sim-it, a straightforward tool for the simulation of both structural variation and long-read data. These simulations from Sim-it reveal the strengths and weaknesses of currently available structural variation callers and long-read sequencing platforms. With these findings, we develop a new method (combiSV) that combines the results from structural variation callers into a superior call set with increased recall and precision, an improvement that is also observed on the latest structural variation benchmark set developed by the GIAB Consortium.
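
To make the terms recall and precision concrete, the following is a minimal sketch of how a call set might be scored against a simulated truth set, and how several call sets might be merged by simple support counting. It is not the combiSV algorithm, whose details are not given in this abstract. The `SV` record, the `matches` rule, the 500 bp `max_dist` tolerance, and the `min_support` threshold are all illustrative assumptions.

```python
from collections import namedtuple

# Hypothetical minimal SV record (assumption): chromosome, start position, SV type.
SV = namedtuple("SV", ["chrom", "pos", "svtype"])


def matches(a, b, max_dist=500):
    """Treat two SVs as the same event if chromosome and type agree and the
    start positions lie within max_dist bp (an assumed tolerance)."""
    return (
        a.chrom == b.chrom
        and a.svtype == b.svtype
        and abs(a.pos - b.pos) <= max_dist
    )


def precision_recall(calls, truth, max_dist=500):
    """Score a call set against a truth set; each truth SV is matched at most once."""
    matched_truth = set()
    true_positives = 0
    for call in calls:
        for i, t in enumerate(truth):
            if i not in matched_truth and matches(call, t, max_dist):
                matched_truth.add(i)
                true_positives += 1
                break
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def consensus(call_sets, min_support=2, max_dist=500):
    """Keep one representative of every SV reported by at least min_support
    call sets. A simple support-count merge, used here only for illustration."""
    kept = []
    for calls in call_sets:
        for call in calls:
            support = sum(
                any(matches(call, other, max_dist) for other in call_set)
                for call_set in call_sets
            )
            already_kept = any(matches(call, k, max_dist) for k in kept)
            if support >= min_support and not already_kept:
                kept.append(call)
    return kept
```

With a simulated truth set, `precision_recall(consensus([calls_a, calls_b, calls_c]), truth)` can be compared against the score of each individual caller to see whether merging helps on a given data set.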

Keywords: Benchmark; Long-read sequencing; Simulated model; Structural variation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Benchmarking*
  • Computer Simulation*
  • Genome, Human*
  • Genomics
  • Humans
  • Nanopore Sequencing
  • Sequence Analysis*
  • Software