GLProbs: Aligning Multiple Sequences Adaptively

IEEE/ACM Trans Comput Biol Bioinform. 2015 Jan-Feb;12(1):67-78. doi: 10.1109/TCBB.2014.2316820.

Abstract

This paper introduces a simple and effective approach to improving the accuracy of multiple sequence alignment. We use a natural measure to estimate the similarity of the input sequences and, based on this measure, choose how to align them. For example, inputs with high similarity are aligned globally over their whole length, while for inputs with moderately low similarity we may ignore the flanking regions and align them locally. To test the effectiveness of this approach, we implemented a multiple sequence alignment tool called GLProbs and compared its performance with about a dozen leading alignment tools on three benchmark alignment databases; GLProbs's alignments achieved the best scores in almost all tests. We also evaluated the practical usefulness of GLProbs's alignments in three biological applications, namely phylogenetic tree construction, protein secondary structure prediction, and the detection of high-risk members for cervical cancer in the HPV-E6 family, and the results are very encouraging.
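
The adaptive idea described in the abstract (estimate similarity first, then pick an alignment mode) can be illustrated with a minimal sketch. This is not the GLProbs implementation: the abstract does not specify the similarity measure or any cutoffs, so the k-mer Jaccard similarity, the function names, and the 0.4 threshold below are all hypothetical stand-ins chosen only to make the control flow concrete.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_similarity(a, b, k=3):
    """Jaccard similarity of two sequences' k-mer sets.

    Assumption: a crude, alignment-free stand-in for the paper's
    (unspecified here) similarity measure.
    """
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def mean_similarity(seqs, k=3):
    """Average similarity over all unordered pairs of input sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 1.0  # a single sequence is trivially similar to itself
    return sum(kmer_similarity(a, b, k) for a, b in pairs) / len(pairs)


def choose_strategy(seqs, threshold=0.4, k=3):
    """Pick an alignment mode from the estimated similarity.

    Assumption: a single hypothetical threshold separating "global"
    from "local"; the real tool may use different measures and cutoffs.
    """
    return "global" if mean_similarity(seqs, k) >= threshold else "local"


if __name__ == "__main__":
    similar = ["ACDEFGHIK", "ACDEFGHIK"]
    dissimilar = ["ACDEFGHIK", "LMNPQRSTV"]
    print(choose_strategy(similar))
    print(choose_strategy(dissimilar))
```

In this sketch the strategy label would then be passed to whichever global or local pairwise aligner is available; the selection step is kept separate from the alignment step so that the similarity measure can be swapped without touching the aligner.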

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Markov Chains
  • Molecular Sequence Data
  • Phylogeny
  • Protein Structure, Secondary
  • Proteins / chemistry
  • Proteins / classification
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Proteins