Fast and Accurate Multiple Sequence Alignment with MSAProbs-MPI

Methods Mol Biol. 2021:2231:39-47. doi: 10.1007/978-1-0716-1036-7_3.

Abstract

Multiple sequence alignment (MSA) is a central step in many bioinformatics and computational biology analyses. Although many MSA methods exist, most of them cannot handle large datasets because of their high computational cost. MSAProbs-MPI is a publicly available tool (http://msaprobs.sourceforge.net) that produces highly accurate alignments in a relatively short runtime by exploiting the hardware resources of multicore clusters. In this chapter, I explain the statistical and biological concepts that MSAProbs-MPI employs to complete the alignments, as well as the high-performance computing techniques used to accelerate it. I also provide some hints about the configuration parameters that should be used to achieve high-performance executions.

Keywords: High-performance computing; MSAProbs-MPI; Message passing interface; Multiple sequence alignment; Multithreading; Parallel computing.

MeSH terms

  • Algorithms
  • Computational Biology / instrumentation
  • Computational Biology / methods*
  • Computing Methodologies
  • Sequence Alignment / instrumentation
  • Sequence Alignment / methods*
  • Software*