Detection of Somatic Mutations in Exome Sequencing of Tumor-only Samples

Sci Rep. 2017 Nov 21;7(1):15959. doi: 10.1038/s41598-017-14896-7.

Abstract

Due to lack of normal samples in clinical diagnosis and to reduce costs, detection of small-scale mutations from tumor-only samples is required but remains relatively unexplored. We developed an algorithm (GATKcan) augmenting GATK with two statistics and machine learning to detect mutations in cancer. The averaged performance of GATKcan in ten experiments outperformed GATK in detecting mutations of randomly sampled 231 from 241 TCGA endometrial tumors (EC). In external validations, GATKcan outperformed GATK in TCGA breast cancer (BC), ovarian cancer (OC) and melanoma tumors, in terms of Matthews correlation coefficient (MCC) and precision, where MCC takes both sensitivity and specificity into account. Further, GATKcan reduced high fractions of false positives detected by GATK. In mutation detection of somatic variants, classified commonly by VarScan 2 and MuTect from the called variants in BC, OC and melanoma, ranked by adjusted MCC (adjusted precision) GATKcan was the top 1, followed by MuTect, VarScan 2 and GATK. Importantly, GATKcan enables detection of mutations when alternate alleles exist in normal samples. These results suggest that GATKcan trained by a cancer is able to detect mutations in future patients with the same type of cancer and is likely applicable to other cancers with similar mutations.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Sequence
  • Exome Sequencing*
  • Humans
  • Mutation / genetics*
  • Neoplasms / genetics*
  • Reproducibility of Results