Accurate somatic variant detection using weakly supervised deep learning

Nat Commun. 2022 Jul 22;13(1):4248. doi: 10.1038/s41467-022-31765-8.

Abstract

Identification of somatic mutations in tumor samples is commonly based on statistical methods in combination with heuristic filters. Here we develop VarNet, an end-to-end deep learning approach for identification of somatic variants from aligned tumor and matched normal DNA reads. VarNet is trained using image representations of 4.6 million high-confidence somatic variants annotated in 356 tumor whole genomes. We benchmark VarNet across a range of publicly available datasets, demonstrating performance often exceeding current state-of-the-art methods. Overall, our results demonstrate how a scalable deep learning approach could augment and potentially supplant human-engineered features and heuristic filters in somatic variant calling.
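
To make the idea of an "image representation" of aligned reads concrete, the following is a minimal illustrative sketch, not VarNet's actual encoding, which the abstract does not specify. It assumes each read is given as a hypothetical (offset, sequence) pair with no indels, and that a candidate site is encoded as a stack of normal rows above tumor rows with four one-hot base channels plus one reference-mismatch channel. The function names, channel layout, and toy reads are all assumptions made for illustration.

```python
import numpy as np

BASES = "ACGT"

def encode_pileup(reads, ref, width):
    """Encode reads as a (n_reads, width, 5) array.

    Channels 0-3 are one-hot bases; channel 4 flags reference mismatches.
    `reads` is a list of (offset, sequence) pairs (hypothetical input format).
    """
    img = np.zeros((len(reads), width, 5), dtype=np.float32)
    for row, (offset, seq) in enumerate(reads):
        for i, base in enumerate(seq):
            col = offset + i
            if 0 <= col < width and base in BASES:
                img[row, col, BASES.index(base)] = 1.0
                if base != ref[col]:
                    img[row, col, 4] = 1.0
    return img

def encode_pair(tumor_reads, normal_reads, ref, width, depth):
    """Stack normal rows above tumor rows, each padded or truncated to `depth`."""
    def fit(reads):
        img = encode_pileup(reads[:depth], ref, width)
        pad = np.zeros((depth - img.shape[0], width, 5), dtype=np.float32)
        return np.concatenate([img, pad], axis=0)
    return np.concatenate([fit(normal_reads), fit(tumor_reads)], axis=0)

# Toy example (made-up reads): the second tumor read carries a T where the
# reference has an A at column 4; the normal read matches the reference.
ref = "ACGTACGT"
tumor_reads = [(0, "ACGTAC"), (2, "GTTCGT")]
normal_reads = [(0, "ACGTACGT")]
image = encode_pair(tumor_reads, normal_reads, ref, width=8, depth=4)
```

In this toy layout a somatic candidate shows up as signal in the mismatch channel of the tumor rows with none in the normal rows, which is the kind of pattern a convolutional network could in principle learn without hand-written filters.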

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Benchmarking
  • Deep Learning*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Neoplasms* / genetics