Protein Sequence Comparison Based on Physicochemical Properties and the Position-Feature Energy Matrix

Sci Rep. 2017 Apr 10;7:46237. doi: 10.1038/srep46237.

Abstract

We develop a novel position-feature-based model for protein sequences that employs the physicochemical properties of the 20 amino acids and the measure of graph energy. The method emphasizes sequence-order information and describes the local dynamic distributions of sequences, from which a characteristic B-vector is obtained for each sequence. We then apply relative entropy to the B-vectors representing the sequences to measure their similarity/dissimilarity. The numerical results obtained in this study show that the proposed method leads to meaningful results compared with competitors such as Clustal W.
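The comparison step described above, relative entropy applied to per-sequence feature vectors, can be illustrated with a minimal sketch. The abstract does not specify how the B-vectors are normalized or whether the relative entropy is symmetrized, so the sketch assumes non-negative vectors rescaled to sum to 1 and a symmetrized Kullback-Leibler form. The function names `normalize`, `relative_entropy`, and `symmetric_distance` are hypothetical and are not taken from the paper.

```python
import math


def normalize(vec):
    """Rescale a non-negative vector so its entries sum to 1 (assumed step)."""
    if any(v < 0 for v in vec):
        raise ValueError("entries must be non-negative")
    total = sum(vec)
    if total <= 0:
        raise ValueError("vector must have a positive sum")
    return [v / total for v in vec]


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) using the natural logarithm."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    p, q = normalize(p), normalize(q)
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # a zero entry in p contributes nothing
        if qi == 0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total


def symmetric_distance(p, q):
    """Symmetrized form D(p || q) + D(q || p); an assumption, not the paper's definition."""
    return relative_entropy(p, q) + relative_entropy(q, p)
```

Under these assumptions, identical vectors give a distance of zero and larger values indicate more dissimilar sequences. A pairwise matrix of such distances could then feed a standard tree-building method for comparison with an alignment-based tool.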

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Antifreeze Proteins / chemistry
  • Chemical Phenomena*
  • Entropy
  • Humans
  • Isoelectric Point
  • Numerical Analysis, Computer-Assisted
  • Phylogeny
  • Proteins / chemistry*
  • Sequence Homology, Amino Acid*
  • Transferrin / chemistry
  • beta-Globins / chemistry

Substances

  • Antifreeze Proteins
  • Proteins
  • Transferrin
  • beta-Globins