GSTaxClassifier: a genomic signature based taxonomic classifier for metagenomic data analysis

Bioinformation. 2009 Aug 20;4(1):46-9. doi: 10.6026/97320630004046.

Abstract

GSTaxClassifier (Genomic Signature based Taxonomic Classifier) is a program for metagenomic analysis of shotgun DNA sequences. The program includes a simple but effective algorithm, a modification of the Bayesian method, to predict the most probable genomic origins of sequences at different taxonomic ranks on the basis of genome databases; a function to generate genomic profiles of reference sequences with tri-, tetra-, penta-, and hexa-nucleotide motifs for setting up a user-defined database; two different formats (tabular and tree-based summaries) to display taxonomic predictions with improved analytical methods; and effective ways to retrieve, search, and summarize results by integrating the predictions into the NCBI tree-based taxonomic information. GSTaxClassifier takes nucleotide sequences as input and, using a modified Bayesian model, evaluates the genomic signatures shared between metagenomic query sequences and reference genome databases. Simulation studies on numerical data sets showed that GSTaxClassifier could serve as a useful program for metagenomics studies. It is freely available at http://helix2.biotech.ufl.edu:26878/metagenomics/.
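
To make the idea of a genomic-signature profile concrete, the following is a minimal sketch of a plain naive Bayes classifier over k-nucleotide motif frequencies. It is not the modified Bayesian model used by GSTaxClassifier, whose details are not given in this abstract. The function names (`kmer_counts`, `build_profile`, `classify`), the additive pseudocount, the uniform prior over references, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def build_profile(seq, k, pseudocount=1.0):
    """Return log-probabilities for all 4**k motifs (additive smoothing is an assumption)."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values()) + pseudocount * 4 ** k
    profile = {}
    for motif in product("ACGT", repeat=k):
        kmer = "".join(motif)
        profile[kmer] = math.log((counts[kmer] + pseudocount) / total)
    return profile


def classify(query, profiles, k):
    """Score a query against each reference profile; a uniform prior is assumed."""
    query_counts = kmer_counts(query, k)
    scores = {
        label: sum(n * profile[kmer] for kmer, n in query_counts.items())
        for label, profile in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy reference "genomes" (hypothetical data, for illustration only).
references = {
    "AT_rich": "ATATATTTAAATATAT" * 20,
    "GC_rich": "GCGCGGCCGCGCCGGC" * 20,
}
K = 3  # tri-nucleotide motifs; 4, 5, or 6 would give tetra-, penta-, hexa-nucleotide profiles
profiles = {label: build_profile(seq, K) for label, seq in references.items()}
best_label, log_scores = classify("ATTTATAATA", profiles, K)
```

In this toy run `best_label` is `"AT_rich"`, because every tri-nucleotide in the query is absent from the GC-only reference and therefore receives only the pseudocount probability under that profile. Predicting at different taxonomic ranks would additionally require mapping each reference to its lineage, which this sketch does not attempt.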

Keywords: Bayesian method; genomic signature; metagenomics; taxonomy.