Support vector machines for prediction of protein signal sequences and their cleavage sites

Peptides. 2003 Jan;24(1):159-61. doi: 10.1016/s0196-9781(02)00289-9.

Abstract

Given a nascent protein sequence, how can one predict its signal peptide, or "zip code" sequence? The problem matters because signal peptides can serve as vehicles for finding new drugs or for reprogramming cells for gene therapy (see, e.g., K.C. Chou, Current Protein and Peptide Science 2002;3:615-22). In this paper, support vector machines (SVMs), a new machine learning method, are applied to this problem. The overall rate of correct prediction for 1939 secretory proteins and 1440 nonsecretory proteins was over 91%. It has not escaped our attention that the new method may also serve as a useful tool for further investigating many unclear details of the molecular mechanism of the zip code protein-sorting system in cells.
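
The abstract does not state how sequences were encoded or which kernel was used, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes an amino acid composition encoding of the first 20 residues and a linear SVM trained by plain subgradient descent on the hinge loss. All names (composition, train_linear_svm, predict, TOY) are hypothetical, and the six sequences are invented toy inputs, not data from the study.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq, window=20):
    """Fraction of each standard amino acid in the first `window` residues.

    Assumed encoding for illustration only; expects a non-empty sequence.
    """
    head = seq[:window].upper()
    counts = Counter(head)
    return [counts[aa] / len(head) for aa in AMINO_ACIDS]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=200):
    """Minimise hinge loss + (lam / 2) * ||w||^2 by subgradient descent.

    Labels must be +1 or -1. Returns the weight vector and the bias.
    """
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            if y * (dot(w, x) + b) < 1:
                # Inside the margin: step along the hinge subgradient.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Outside the margin: only the regulariser acts.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 (signal-like) or -1 according to the sign of w.x + b."""
    return 1 if dot(w, x) + b >= 0 else -1


# Invented toy sequences (NOT from the paper): hydrophobic-rich N-termini
# are labelled +1, charged/polar-rich N-termini are labelled -1.
TOY = [
    ("MKLLLLALLAVLLAAASAAA", 1),
    ("MDEEKKRDEEDKKSEDERKD", -1),
    ("MRVLLVALLLVAAALLGSAA", 1),
    ("MSDEEKRKDDEETKRDEEKK", -1),
    ("MKALLVVLLLAVVALASAGA", 1),
    ("MTEDKRKEEDDSKREEDKRD", -1),
]

xs = [composition(seq) for seq, _ in TOY]
ys = [label for _, label in TOY]
w, b = train_linear_svm(xs, ys)
predictions = [predict(w, b, x) for x in xs]
```

The toy set is linearly separable by construction, so it says nothing about real accuracy. Reproducing a figure like the reported 91% would require the actual 1939 secretory and 1440 nonsecretory sequences and a held-out evaluation.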

MeSH terms

  • Genetic Vectors*
  • Hydrolysis
  • Protein Sorting Signals*
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism*

Substances

  • Protein Sorting Signals
  • Proteins