Construction and characterization of a rock-cluster-based EST analysis pipeline

Comput Biol Chem. 2006 Feb;30(1):81-6. doi: 10.1016/j.compbiolchem.2005.10.003.

Abstract

Open access to the vast amount of expressed sequence tag (EST) data in public databases has provided a powerful platform for gene identification, gene expression studies, and comparative and functional genomics. Managing large-scale EST data requires high-performance clusters and analysis software, especially parallel software. We report a convenient approach to constructing a high-performance computing (HPC) cluster based on the popular Rocks cluster distribution, together with a Perl-scripted analysis pipeline that performs EST pre-processing, clustering, assembly, and annotation, and can incorporate any other desired analysis modules, through parallel computing. We tested the system with different datasets on an increasing number of nodes. Our results show that the cluster and pipeline accelerate EST analysis without manual intervention.
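
The abstract does not say how the pipeline distributes work across nodes. As a rough illustration only, one common way to parallelize EST processing is to split the input FASTA records into one chunk per node and process the chunks independently. The sketch below assumes that approach; the function names `parse_fasta` and `split_round_robin` are hypothetical and are not taken from the authors' Perl pipeline.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', concatenated sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    seq_parts: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(seq_parts)
            header = line[1:]
            seq_parts = []
        elif header is not None:
            seq_parts.append(line)
    if header is not None:
        yield header, "".join(seq_parts)


def split_round_robin(records: Iterable[Record], n_chunks: int) -> List[List[Record]]:
    """Distribute records over n_chunks lists, one per (assumed) compute node."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[Record]] = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].append(rec)
    return chunks


if __name__ == "__main__":
    # Toy input; real EST datasets would be read from files.
    example = ">est1\nACGT\nAC\n>est2\nGG\n>est3\nTT\n"
    for node, chunk in enumerate(split_round_robin(parse_fasta(example), 2)):
        print(node, [name for name, _ in chunk])
```

Round-robin assignment keeps chunk sizes within one record of each other, which suits steps such as pre-processing or annotation where records are handled independently. Clustering and assembly compare sequences against each other, so they would need a different partitioning strategy than the one sketched here.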

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Computational Biology / methods*
  • Databases, Genetic
  • Expressed Sequence Tags / chemistry*
  • Genome, Human
  • Humans
  • Software*