LinkedSV for detection of mosaic structural variants from linked-read exome and genome sequencing data

Nat Commun. 2019 Dec 6;10(1):5585. doi: 10.1038/s41467-019-13397-7.

Abstract

Linked-read sequencing provides long-range information on short-read sequencing data by barcoding reads originating from the same DNA molecule, and can improve detection and breakpoint identification for structural variants (SVs). Here we present LinkedSV for SV detection on linked-read sequencing data. LinkedSV considers barcode overlapping and enriched fragment endpoints as signals to detect large SVs, while it leverages read depth, paired-end signals and local assembly to detect small SVs. Benchmarking studies demonstrate that LinkedSV outperforms existing tools, especially on exome data and on somatic SVs with low variant allele frequencies. We demonstrate clinical cases where LinkedSV identifies disease-causal SVs from linked-read exome sequencing data missed by conventional exome sequencing, and show examples where LinkedSV identifies SVs missed by high-coverage long-read sequencing. In summary, LinkedSV can detect SVs missed by conventional short-read and long-read sequencing approaches, and may resolve negative cases from clinical genome/exome sequencing studies.
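The barcode-overlap signal mentioned in the abstract can be illustrated with a toy sketch. This is not LinkedSV's implementation. The read tuples, window coordinates, and the function names `barcodes_in_window` and `shared_barcode_count` are all hypothetical and chosen only for illustration. The idea, under these assumptions, is that two genomic windows lying far apart on the reference should share few barcodes unless they are adjacent in the sample genome, so an unexpectedly high shared-barcode count is one hint of a large SV.

```python
# Toy illustration of barcode overlap between two genomic windows.
# Assumption: each read is reduced to (chromosome, position, barcode).
# Real linked-read data would come from an aligned BAM file instead.

def barcodes_in_window(reads, chrom, start, end):
    """Return the set of barcodes of reads located in [start, end) on chrom."""
    return {bc for (c, pos, bc) in reads if c == chrom and start <= pos < end}


def shared_barcode_count(reads, win_a, win_b):
    """Count barcodes observed in both windows (each window is (chrom, start, end))."""
    a = barcodes_in_window(reads, *win_a)
    b = barcodes_in_window(reads, *win_b)
    return len(a & b)


# Made-up reads: BC01 and BC02 appear in two windows 500 kb apart on chr1.
reads = [
    ("chr1", 10_100, "BC01"),
    ("chr1", 10_400, "BC02"),
    ("chr1", 10_900, "BC03"),
    ("chr1", 510_200, "BC01"),
    ("chr1", 510_700, "BC02"),
    ("chr1", 510_800, "BC04"),
    ("chr2", 10_300, "BC05"),
]

win_a = ("chr1", 10_000, 11_000)
win_b = ("chr1", 510_000, 511_000)
win_c = ("chr2", 10_000, 11_000)

shared_ab = shared_barcode_count(reads, win_a, win_b)  # distant windows, shared barcodes
shared_ac = shared_barcode_count(reads, win_a, win_c)  # no shared barcodes
print(shared_ab, shared_ac)
```

In practice a raw count would need to be compared against a background expectation (for example, from barcode density and fragment-length distribution) before being treated as evidence. That statistical step is omitted here.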

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Base Sequence*
  • Chromosome Breakpoints
  • DNA Mutational Analysis / methods*
  • Exome*
  • Genome / genetics
  • Genome, Human
  • Genomic Structural Variation / genetics*
  • Humans
  • Models, Genetic
  • Neurofibromin 1 / genetics
  • Sequence Analysis, DNA
  • Sequence Deletion*
  • Software

Substances

  • NF1 protein, human
  • Neurofibromin 1