MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems

Bioinformatics. 2016 Dec 15;32(24):3826-3828. doi: 10.1093/bioinformatics/btw558. Epub 2016 Sep 16.

Abstract

MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input datasets. In this work we present MSAProbs-MPI, a distributed-memory parallel version of the multithreaded MSAProbs tool that is able to reduce runtimes by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on a cluster with 32 nodes (each containing two Intel Haswell processors) shows reductions in execution time of over one order of magnitude for typical input datasets. Furthermore, MSAProbs-MPI using eight nodes is faster than the GPU-accelerated QuickProbs running on a Tesla K20. Another strong point is that MSAProbs-MPI can deal with large datasets for which MSAProbs and QuickProbs might fail due to time and memory constraints, respectively.

Availability and implementation: Source code in C++ and MPI running on Linux systems as well as a reference manual are available at http://msaprobs.sourceforge.net

Contact: jgonzalezd@udc.es

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Markov Chains
  • Proteins*
  • Sequence Alignment / methods*
  • Software

Substances

  • Proteins