A novel numerical model for protein sequences analysis based on spherical coordinates and multiple physicochemical properties of amino acids

Biopolymers. 2019 Aug;110(8):e23282. doi: 10.1002/bip.23282. Epub 2019 Apr 12.

Abstract

Characterizing short protein sequences in a way that connects them effectively to their functions remains an unsolved problem. Here we propose to map the physicochemical properties of each amino acid onto unit spheres so that each protein sequence can be represented quantitatively. We demonstrate the usefulness of this representation by applying it to the prediction of cell-penetrating peptides. Combined with traditional composition features, it yields the best performance among the several methods compared across different datasets. For the convenience of users, a web server that automatically calculates the proposed features is available at http://biophy.dzu.edu.cn/SNumD/.
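
The abstract does not give the mapping itself, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes two illustrative properties (the Kyte-Doolittle hydropathy index and approximate amino acid molecular weight), rescales each to [0, 1], and uses them as the polar and azimuthal angles of a point on the unit sphere. The sequence is then summarized by the mean of its residue points and concatenated with the standard 20-dimensional amino acid composition. The choice of properties, the angle scaling, the averaging step, and all function names are assumptions made for illustration.

```python
import math

# Illustrative property 1 (assumption): Kyte-Doolittle hydropathy index.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Illustrative property 2 (assumption): approximate molecular weight in Da.
MASS = {
    "A": 89.09, "R": 174.20, "N": 132.12, "D": 133.10, "C": 121.16,
    "Q": 146.15, "E": 147.13, "G": 75.07, "H": 155.16, "I": 131.17,
    "L": 131.17, "K": 146.19, "M": 149.21, "F": 165.19, "P": 115.13,
    "S": 105.09, "T": 119.12, "W": 204.23, "Y": 181.19, "V": 117.15,
}

AMINO_ACIDS = sorted(HYDROPATHY)  # fixed ordering of the 20 residues


def _min_max(scale):
    """Rescale a property table linearly to the range [0, 1]."""
    lo, hi = min(scale.values()), max(scale.values())
    return {aa: (v - lo) / (hi - lo) for aa, v in scale.items()}


_H = _min_max(HYDROPATHY)
_M = _min_max(MASS)


def residue_to_sphere(aa):
    """Map one residue to a point (x, y, z) on the unit sphere.

    Hypothetical mapping: hydropathy sets the polar angle theta in [0, pi],
    mass sets the azimuthal angle phi in [0, pi].
    """
    theta = math.pi * _H[aa]
    phi = math.pi * _M[aa]
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )


def sphere_features(seq):
    """Mean of the residue points: three numbers per sequence."""
    points = [residue_to_sphere(aa) for aa in seq]
    return [sum(p[i] for p in points) / len(points) for i in range(3)]


def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def combined_features(seq):
    """Concatenate sphere-based and composition features (3 + 20 values)."""
    return sphere_features(seq) + composition(seq)
```

For example, `combined_features("RQIKIWFQNRRMKWKK")` (the penetratin sequence, a well-known cell-penetrating peptide) returns a 23-element vector that could be passed to any standard classifier. One limitation of this particular sketch is that residues at the extremes of the hydropathy scale land on the poles, where the azimuthal angle carries no information.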

Keywords: cell penetrating peptides; graphical representation; protein sequence analysis; sequence feature.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Proteins / chemistry*
  • Sequence Analysis, Protein / methods
  • User-Computer Interface

Substances

  • Proteins