Annotation of Variant Data from High-Throughput DNA Sequencing from Tumor Specimens: Filtering Strategies to Identify Driver Mutations

Methods Mol Biol. 2019:1908:49-60. doi: 10.1007/978-1-4939-9004-7_4.

Abstract

Next-generation sequencing technologies have enabled the analysis of a wide spectrum of somatic mutations in tumors. This analysis can be carried out using various strategies, including small panels of focused, clinically actionable genes, large panels of cancer-related genes, whole exomes, and entire genomes. One of the main goals of these analyses is to identify the key mutations that drive the oncogenic process. Depending on the gene, mutations can have differing effects, ranging from loss-of-function mutations in tumor suppressor genes to mutations that activate genes, such as kinases involved in cell cycle progression or proliferation. Once sequencing is complete, the reads have been aligned to the reference genome, and variant calling has been carried out, one is left with a large collection of variants. The challenge then becomes determining where each variant resides in the genome with respect to coding regions, splice site regions, and regulatory regions, and what potential functional effect it may have on the resulting protein. Other helpful information includes whether the variant has been identified before and, if so, the tumor type associated with it. In addition, if the tumor profiling experiment is not conducted with a matched specimen representing the inherited genome, various tools can help determine whether a variant is likely to be an inherited polymorphism or a somatic event. In this chapter, we review the tools available for annotating variants to assist in filtering and prioritizing the hundreds to thousands of variants down to the key variants likely to be driver mutations and relevant to the tumor being profiled.
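
The filtering logic outlined above can be illustrated with a minimal Python sketch. It is not the chapter's method or any specific tool's API: the `Variant` fields, the consequence labels, the placeholder gene names, and the 1% population-frequency cutoff are all assumptions chosen for illustration. The sketch keeps variants that fall in coding or splice site regions, drops those common enough in a population database to be likely inherited polymorphisms, and ranks previously reported tumor variants first.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical consequence labels; real annotators use their own vocabularies.
CODING_OR_SPLICE = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass
class Variant:
    gene: str
    consequence: str                # e.g. "missense", "synonymous", "intronic"
    population_af: Optional[float]  # population allele frequency; None if unseen
    seen_in_tumors: bool            # previously reported as a somatic tumor variant


def is_candidate_driver(v: Variant, max_population_af: float = 0.01) -> bool:
    """Apply two illustrative filters; the 1% cutoff is an assumption."""
    # Keep only variants expected to alter the protein or its splicing.
    if v.consequence not in CODING_OR_SPLICE:
        return False
    # Without a matched normal, a common population variant is more likely
    # an inherited polymorphism than a somatic event.
    if v.population_af is not None and v.population_af >= max_population_af:
        return False
    return True


def prioritize(variants: List[Variant]) -> List[Variant]:
    """Filter, then list previously reported tumor variants first."""
    kept = [v for v in variants if is_candidate_driver(v)]
    # sorted() is stable, so input order is preserved within each group.
    return sorted(kept, key=lambda v: not v.seen_in_tumors)


# Placeholder records, not real data.
variants = [
    Variant("GENE_D", "frameshift", 0.0001, False),
    Variant("GENE_B", "synonymous", None, False),
    Variant("GENE_C", "nonsense", 0.2, False),
    Variant("GENE_A", "missense", None, True),
]
result = prioritize(variants)
```

In practice the thresholds and the ranking criteria would depend on the panel, the tumor type, and the annotation sources in use.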

Keywords: Driver mutations; Functional predictions; Next-generation sequencing (NGS); Somatic variants.

MeSH terms

  • Computational Biology / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Mutation*
  • Neoplasms / genetics*
  • Polymorphism, Genetic
  • Sequence Analysis, DNA / methods*
  • Software*