Multiple sequence alignment in parallel on a workstation cluster

Bioinformatics. 2004 May 1;20(7):1193-5. doi: 10.1093/bioinformatics/bth055. Epub 2004 Feb 5.

Abstract

Summary: Multiple sequence alignment is the NP-hard problem of aligning three or more DNA or amino acid sequences so as to match as many characters as possible across the set of sequences. The popular sequence alignment program ClustalW uses the classical approximation method: it first computes a distance matrix and then constructs a guide tree that shows the evolutionary relationships among the sequences. We show that parallelizing the ClustalW algorithm can yield significant speedup. Our implementation is written in C with the Message Passing Interface (MPI) and runs on a cluster of workstations. Experimental results show that a speedup of over 5.5 on six processors is obtainable for most inputs.
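
The distance-matrix stage lends itself to parallel work because each pairwise comparison is independent of the others. The following is a minimal sketch of that idea only, not the authors' implementation: the round-robin partitioning, the function names (toy_distance, pairs_for_worker, distance_matrix, parallel_efficiency), and the mismatch-fraction distance are all assumptions made for illustration, and the workers are simulated in a serial loop rather than run under MPI.

```python
from itertools import combinations


def toy_distance(a, b):
    """Fraction of differing positions; a stand-in for a real pairwise alignment score."""
    if len(a) != len(b):
        raise ValueError("toy_distance expects equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairs_for_worker(n_seqs, rank, n_workers):
    """Round-robin share of the n*(n-1)/2 sequence pairs for one worker (hypothetical scheme)."""
    all_pairs = list(combinations(range(n_seqs), 2))
    return all_pairs[rank::n_workers]


def distance_matrix(seqs, n_workers):
    """Fill a symmetric distance matrix from the per-worker partial results."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    # Each loop iteration plays the role of one worker; in a message-passing
    # program these would run concurrently and send their results to a root.
    for rank in range(n_workers):
        for i, j in pairs_for_worker(n, rank, n_workers):
            d = toy_distance(seqs[i], seqs[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


def parallel_efficiency(speedup, n_processors):
    """Efficiency is the speedup divided by the processor count."""
    return speedup / n_processors


seqs = ["ACGT", "ACGA", "TCGA", "TTTT"]
matrix = distance_matrix(seqs, n_workers=3)
```

Under this definition, the reported speedup of over 5.5 on six processors corresponds to a parallel efficiency above roughly 0.92.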

Availability: The software is available upon request from the second author.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Computing Methodologies
  • Gene Expression Profiling / methods*
  • Local Area Networks
  • Microcomputers*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Analysis / methods*