Multi-threaded vectorized distance matrix computation on the CELL/BE and x86/SSE2 architectures

Bioinformatics. 2010 May 15;26(10):1368-9. doi: 10.1093/bioinformatics/btq135. Epub 2010 Mar 26.

Abstract

Summary: Multiple sequence alignment is an important tool in bioinformatics. Although efficient heuristic algorithms exist for this problem, the exponential growth of biological data demands even higher throughput. The recent emergence of multi-core technologies has made it possible to substantially reduce execution time for many bioinformatics applications. In this article, we introduce an implementation that accelerates distance matrix computation on x86 and the Cell Broadband Engine, which are a homogeneous and a heterogeneous multi-core system, respectively. By taking advantage of multiple processors as well as Single Instruction, Multiple Data (SIMD) vectorization, we were able to achieve speed-ups of two orders of magnitude compared to the publicly available implementation utilized in ClustalW.
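As an illustration of how multi-threading and SIMD vectorization can be combined when filling a distance matrix, the following is a minimal sketch in C. It is not the authors' implementation. It assumes equal-length, pre-aligned sequences and uses the fraction of mismatching positions as a stand-in for the alignment-based pairwise distance used in ClustalW. It also assumes OpenMP for threading and SSE2 intrinsics for the inner loop, with a scalar fallback on other targets. The names distance_matrix, mismatch_count and popcount16 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Count set bits in a 16-bit mask (portable, no compiler builtins). */
static unsigned popcount16(unsigned m)
{
    unsigned c = 0;
    while (m) {
        m &= m - 1; /* clear the lowest set bit */
        ++c;
    }
    return c;
}

/* Number of positions at which a and b differ; both have length len. */
static size_t mismatch_count(const char *a, const char *b, size_t len)
{
    size_t i = 0, same = 0;
#if defined(__SSE2__)
    /* Compare 16 residues per iteration. */
    for (; i + 16 <= len; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));
        same += popcount16(mask); /* one bit per matching byte */
    }
#endif
    /* Scalar tail (or the whole sequence when SSE2 is unavailable). */
    for (; i < len; ++i)
        same += (a[i] == b[i]);
    return len - same;
}

/* Fill the symmetric n x n matrix d (row-major) with mismatch fractions.
 * Assumption: all n sequences have the same length len. */
void distance_matrix(const char *const *seqs, size_t n, size_t len, double *d)
{
    long i;
    assert(seqs != NULL && d != NULL);
    /* Each iteration owns the pairs (i, j) with j > i, so no two threads
     * write the same element. Dynamic scheduling balances the shrinking
     * amount of work per row. */
#pragma omp parallel for schedule(dynamic)
    for (i = 0; i < (long)n; ++i) {
        size_t j;
        d[(size_t)i * n + (size_t)i] = 0.0;
        for (j = (size_t)i + 1; j < n; ++j) {
            double v = len
                ? (double)mismatch_count(seqs[i], seqs[j], len) / (double)len
                : 0.0;
            d[(size_t)i * n + j] = v;
            d[j * n + (size_t)i] = v;
        }
    }
}
```

With GCC, for example, the sketch can be compiled with `-O2 -msse2 -fopenmp`. Without `-fopenmp` the pragma is ignored and the outer loop runs serially, which gives the same result.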

Availability and implementation: Source code in C is publicly available at https://sourceforge.net/projects/distmatcomp/

Contact: adri0004@ntu.edu.sg

MeSH terms

  • Computational Biology
  • Pattern Recognition, Automated
  • Sequence Alignment / methods*
  • Software*