Analysis of evolutionary conservation patterns and their influence on identifying protein functional sites

J Bioinform Comput Biol. 2014 Oct;12(5):1440003. doi: 10.1142/S0219720014400034.

Abstract

Evolutionary conservation information included in position-specific scoring matrices (PSSMs) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linearly contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of a PSSM for prediction fails to consider the relationship between the conservation patterns of residues and the distribution of conservation scores in PSSMs. To demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. The results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would largely facilitate prediction.
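
The point that each PSSM value carries no context from neighboring residues can be illustrated with a small sketch. The abstract does not say how the analyzed methods encode context, so the sliding window below is only one common way to do it and is not the authors' procedure. The function name `window_features`, the `half_width` and `pad` parameters, and the toy matrix values are all hypothetical.

```python
from typing import List


def window_features(
    pssm: List[List[float]], half_width: int, pad: float = 0.0
) -> List[List[float]]:
    """Build one feature vector per residue from its PSSM row and its neighbors' rows.

    Assumption: a symmetric window of 2 * half_width + 1 residues, with
    positions beyond the sequence ends filled with `pad`.
    """
    n_cols = len(pssm[0]) if pssm else 0
    pad_row = [pad] * n_cols
    features = []
    for i in range(len(pssm)):
        vector: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            # Use the neighbor's row when it exists, otherwise the padding row.
            vector.extend(pssm[j] if 0 <= j < len(pssm) else pad_row)
        features.append(vector)
    return features


# Toy 4-residue matrix with 3 columns (hypothetical values; a real PSSM
# has 20 columns, one per amino acid type).
toy_pssm = [
    [1.0, -2.0, 0.0],
    [3.0, 0.0, -1.0],
    [-1.0, 4.0, 2.0],
    [0.0, 1.0, -3.0],
]
feats = window_features(toy_pssm, half_width=1)
```

With `half_width=0` each vector is just the residue's own row, which corresponds to using the direct PSSM output. Larger windows would suit linearly contextual sites, while sites that are conserved independently may gain little from them.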

Keywords: Conservation patterns; functional sites; position-specific scoring matrix.

MeSH terms

  • Amino Acids / chemistry
  • Binding Sites / genetics
  • Catalytic Domain / genetics
  • Computational Biology
  • Conserved Sequence
  • Databases, Protein
  • Enzymes / chemistry
  • Enzymes / genetics
  • Enzymes / metabolism
  • Evolution, Molecular*
  • NAD / metabolism
  • Protein Binding
  • Proteins / chemistry
  • Proteins / genetics*
  • Proteins / metabolism*
  • Software
  • Support Vector Machine

Substances

  • Amino Acids
  • Enzymes
  • Proteins
  • NAD