CPAP: Cancer Panel Analysis Pipeline

Hum Mutat. 2013 Oct;34(10):1340-6. doi: 10.1002/humu.22386. Epub 2013 Aug 13.

Abstract

Targeted sequencing using next-generation sequencing technologies is being rapidly adopted for clinical sequencing and cancer marker tests. However, no bioinformatics tool is currently available for the analysis and visualization of multiple targeted sequencing datasets. In the present study, we use cancer panel targeted sequencing datasets generated by the Life Technologies Ion Personal Genome Machine Sequencer as an example to illustrate how to develop an automated pipeline for the comparative analysis of multiple datasets. Cancer Panel Analysis Pipeline (CPAP) uses standard output files from variant calling software to generate a distribution map of SNPs across all samples, drawn as a circular diagram by Circos. The diagram is hyperlinked to a dynamic HTML table that allows users to identify target SNPs by applying different filters. CPAP also integrates additional information about the identified SNPs by linking to an integrated SQL database compiled from SNP-related databases, including dbSNP, the 1000 Genomes Project, COSMIC, and dbNSFP. CPAP takes only 17 min to complete a comparative analysis of 500 datasets. CPAP not only provides an automated platform for the analysis of multiple cancer panel datasets but can also serve as a model for any customized targeted sequencing project.
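
The comparative step described above, collecting the SNPs called in each sample and recording which samples share them, can be illustrated with a minimal sketch. This is not CPAP's code. It assumes the variant caller's output is VCF-like tab-separated text, and the function names (parse_vcf_snps, snp_distribution, filter_by_recurrence), the sample names, and the coordinates are all hypothetical toy values chosen for illustration.

```python
from collections import defaultdict


def parse_vcf_snps(text):
    """Yield (chrom, pos, ref, alt) for single-nucleotide records in VCF-like text.

    Assumption: tab-separated columns in the order CHROM, POS, ID, REF, ALT.
    """
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # skip malformed records
        chrom, pos, _id, ref, alts = fields[:5]
        for alt in alts.split(","):
            # keep only single-base substitutions; indels are ignored here
            if len(ref) == 1 and len(alt) == 1 and alt != ".":
                yield (chrom, int(pos), ref, alt)


def snp_distribution(samples):
    """Map each SNP to the sorted list of sample names in which it was called."""
    carriers = defaultdict(set)
    for name, text in samples.items():
        for snp in parse_vcf_snps(text):
            carriers[snp].add(name)
    return {snp: sorted(names) for snp, names in carriers.items()}


def filter_by_recurrence(distribution, min_samples):
    """Keep SNPs called in at least min_samples samples (one possible filter)."""
    return {
        snp: names
        for snp, names in distribution.items()
        if len(names) >= min_samples
    }


# Toy input: hypothetical samples and made-up coordinates, not real data.
HEADER = "#CHROM\tPOS\tID\tREF\tALT\n"
samples = {
    "sample_A": HEADER + "chr7\t100\t.\tA\tG\nchr12\t200\t.\tC\tT\n",
    "sample_B": HEADER + "chr7\t100\t.\tA\tG\nchr17\t300\t.\tGA\tG\n",
}

distribution = snp_distribution(samples)
recurrent = filter_by_recurrence(distribution, min_samples=2)
print(recurrent)
```

In a pipeline of the kind the abstract describes, a table like this would feed the plotting and HTML filtering stages. Here the recurrence threshold stands in for the user-selectable filters, whose actual criteria the abstract does not specify.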

Keywords: Circos; cancer panel; Ion Torrent; target sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Computational Biology / methods*
  • Databases, Genetic
  • Humans
  • Neoplasms / genetics*
  • Software*
  • Web Browser

Substances

  • Biomarkers, Tumor