Alignment of helical membrane protein sequences using AlignMe

PLoS One. 2013;8(3):e57731. doi: 10.1371/journal.pone.0057731. Epub 2013 Mar 4.

Abstract

Few sequence alignment methods have been designed specifically for integral membrane proteins, even though these important proteins have distinct evolutionary and structural properties that might affect their alignments. Existing approaches typically consider membrane-related information either by using membrane-specific substitution matrices or by assigning distinct penalties for gap creation in transmembrane and non-transmembrane regions. Here, we ask whether favoring matching of predicted transmembrane segments within a standard dynamic programming algorithm can improve the accuracy of pairwise membrane protein sequence alignments. We tested various strategies using a specifically designed program called AlignMe. An updated set of homologous membrane protein structures, called HOMEP2, was used as a reference for optimizing the gap penalties. The best of the membrane-protein-optimized approaches were then tested on an independent reference set of membrane protein sequence alignments from the BAliBASE collection. When secondary structure (S) matching was combined with evolutionary information (using a position-specific substitution matrix (P)), in an approach we called AlignMePS, the resultant pairwise alignments were typically among the most accurate over a broad range of sequence similarities when compared to available methods. Matching transmembrane predictions (T) in addition to evolutionary information and secondary-structure predictions, in an approach called AlignMePST, generally reduced the accuracy of the alignments of closely related proteins in the BAliBASE set relative to AlignMePS, but may be useful in cases of extremely distantly related proteins for which sequence information is less informative. The open-source AlignMe code is available at https://sourceforge.net/projects/alignme/, and at http://www.forrestlab.org, along with an online server and the HOMEP2 data set.
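
The general idea of favoring matches between predicted transmembrane segments within a standard dynamic programming alignment can be illustrated with a minimal sketch. This is not the AlignMe implementation: the function name `align_with_tm`, the identity-based substitution score, the linear gap penalty, and the `tm_weight` parameter are all assumptions chosen for brevity. The sketch performs a global (Needleman-Wunsch-style) alignment in which each residue pair is scored as a substitution term plus a term that rewards similar per-residue transmembrane predictions (values between 0 and 1).

```python
from typing import List, Tuple


def align_with_tm(seq_a: str, seq_b: str,
                  tm_a: List[float], tm_b: List[float],
                  tm_weight: float = 1.0,
                  gap: float = -2.0) -> Tuple[str, str, float]:
    """Global alignment whose pair score adds a transmembrane-matching term.

    Hypothetical scoring, for illustration only:
      substitution: +1 for identical residues, -1 otherwise
      membrane term: tm_weight * (1 - 2 * |tm_a[i] - tm_b[j]|)
    """
    n, m = len(seq_a), len(seq_b)

    def pair_score(i: int, j: int) -> float:
        sub = 1.0 if seq_a[i] == seq_b[j] else -1.0
        tm = tm_weight * (1.0 - 2.0 * abs(tm_a[i] - tm_b[j]))
        return sub + tm

    # score[i][j]: best score aligning seq_a[:i] with seq_b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # move[i][j]: 'D' diagonal (pair), 'U' gap in seq_b, 'L' gap in seq_a
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
        move[i][0] = "U"
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
        move[0][j] = "L"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + pair_score(i - 1, j - 1), "D"),
                (score[i - 1][j] + gap, "U"),
                (score[i][j - 1] + gap, "L"),
            ]
            # max() keeps the first maximum, so ties prefer the diagonal
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner to recover the alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "D":
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i -= 1; j -= 1
        elif step == "U":
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    # Toy input: made-up sequences and made-up transmembrane predictions
    a, b, s = align_with_tm("MKLVA", "MLVA",
                            [0.0, 0.1, 0.9, 0.9, 0.8],
                            [0.0, 0.9, 0.9, 0.8])
    print(a)
    print(b)
    print(s)
```

Setting `tm_weight` to zero reduces the sketch to a purely sequence-based alignment, which makes it easy to compare the two behaviors on the same input. A position-specific substitution matrix or a secondary-structure term, as in the approaches described above, would enter as further additive terms in `pair_score`; the weights and gap penalties actually optimized on HOMEP2 are not reproduced here.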

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Hydrophobic and Hydrophilic Interactions
  • Molecular Sequence Data
  • Protein Structure, Secondary
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Sequence Alignment / methods*
  • Sequence Alignment / statistics & numerical data
  • Sequence Homology, Amino Acid
  • Software*

Substances

  • Proteins

Grants and funding

This study was supported by the German Research Foundation (DFG; www.dfg.de/en) Collaborative Research Center 807 “Transport and Communication across Biological Membranes”. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.