Detection of somatic structural variants from short-read next-generation sequencing data

Brief Bioinform. 2021 May 20;22(3):bbaa056. doi: 10.1093/bib/bbaa056.

Abstract

Somatic structural variants (SVs), which are variants that typically affect >50 nucleotides, play a significant role in cancer development and evolution but are notoriously more difficult to detect than small variants from short-read next-generation sequencing (NGS) data. This is due to a combination of challenges arising from the purity of tumour samples, tumour heterogeneity, limitations of short-read information from NGS and sequence alignment ambiguities. Despite active development of SV detection tools (callers) over the past few years, each method has inherent advantages and limitations. In this review, we highlight some of the important factors affecting somatic SV detection and compare the performance of seven commonly used SV callers. In particular, we focus on how sensitivity and precision change when detecting different SV types and size ranges in samples with differing variant allele frequencies and sequencing depths of coverage. We highlight the reasons why some SV callers perform well in some settings but not in others, allowing our evaluation findings to be extended beyond the seven SV callers examined in this paper. As the importance of large SVs becomes increasingly recognized in cancer genomics, this paper provides a timely review of some of the most impactful factors influencing somatic SV detection that should be considered when choosing SV callers.
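
As a rough illustration of the two metrics named in the abstract, the Python sketch below computes sensitivity and precision for a set of SV calls against a truth set. It is not the evaluation procedure used in the review: the `SV` record, the `matches` rule (same chromosome and SV type, both breakpoints within a tolerance) and the 200 bp default tolerance are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record used only for this illustration."""
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"


def matches(call: SV, truth: SV, tol: int = 200) -> bool:
    """Assumed matching rule: same type and chromosome, and both
    breakpoints within `tol` bp of the truth breakpoints."""
    return (
        call.svtype == truth.svtype
        and call.chrom == truth.chrom
        and abs(call.start - truth.start) <= tol
        and abs(call.end - truth.end) <= tol
    )


def evaluate(calls, truths, tol: int = 200):
    """Return (sensitivity, precision) using greedy one-to-one matching."""
    matched_truth = set()  # indices of truth SVs already claimed by a call
    true_positive_calls = 0
    for call in calls:
        for i, truth in enumerate(truths):
            if i not in matched_truth and matches(call, truth, tol):
                matched_truth.add(i)
                true_positive_calls += 1
                break
    # Sensitivity: fraction of truth SVs recovered by at least one call.
    sensitivity = len(matched_truth) / len(truths) if truths else 0.0
    # Precision: fraction of calls that correspond to a truth SV.
    precision = true_positive_calls / len(calls) if calls else 0.0
    return sensitivity, precision


truths = [SV("chr1", 1000, 5000, "DEL"), SV("chr2", 2000, 9000, "DUP")]
calls = [SV("chr1", 1050, 4990, "DEL"), SV("chr3", 100, 700, "INV")]
sensitivity, precision = evaluate(calls, truths)
print(f"sensitivity={sensitivity:.2f} precision={precision:.2f}")
```

Grouping the truth set and the calls by SV type, size range, variant allele frequency or depth of coverage before calling `evaluate` would give the kind of stratified comparison the abstract describes.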

Keywords: cancer genomics; next-generation sequencing; structural variant; variant caller.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Gene Frequency
  • Genetic Variation
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Neoplasms / genetics*
  • Neoplasms / pathology
  • Sequence Analysis, DNA / methods