A survey on the algorithm and development of multiple sequence alignment

Yongqing Zhang; Qiang Zhang; Jiliu Zhou; Quan Zou

doi:10.1093/bib/bbac069

Brief Bioinform. 2022 May 13;23(3):bbac069. doi: 10.1093/bib/bbac069.

Authors

Yongqing Zhang¹,², Qiang Zhang¹, Jiliu Zhou¹, Quan Zou³

Affiliations

¹ School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China.
² School of Computer Science and Engineering, University of Electronic Science and Technology of China, 611731, Chengdu, China.
³ Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 610054, Chengdu, China.

PMID: 35272347
DOI: 10.1093/bib/bbac069

Abstract

Multiple sequence alignment (MSA) is a cornerstone of bioinformatics: it can reveal information latent in biological sequences, such as function, evolution and structure. MSA is widely used in bioinformatics, for example in phylogenetic analysis, protein analysis and genomic analysis. However, MSA faces new challenges as sequence datasets grow and the demand for alignment accuracy increases. Developing efficient and accurate MSA strategies has therefore become an active research topic in bioinformatics. In this work, we summarize the algorithms for MSA and their applications in bioinformatics. To provide a clear, structured perspective, we systematically introduce the fundamentals of MSA, including background, databases, metrics and benchmarks. We also list the most common applications of MSA in bioinformatics, including database searching, phylogenetic analysis, genomic analysis, metagenomic analysis and protein analysis. Furthermore, we categorize and analyze classical and state-of-the-art algorithms, dividing them into progressive alignment, iterative algorithms, heuristics, machine learning and divide-and-conquer. We also discuss the challenges and opportunities of MSA in bioinformatics. Our work provides a comprehensive survey of MSA applications and the relevant algorithms, and it may offer useful insights to researchers working on MSA and related studies.

Keywords: heuristic; iterative algorithm; multiple sequence alignment; progressive alignment.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Computational Biology*
Machine Learning
Phylogeny
Sequence Alignment