SVMCRYS: an SVM approach for the prediction of protein crystallization propensity from protein sequence

Protein Pept Lett. 2010 Apr;17(4):423-30. doi: 10.2174/092986610790963726.

Abstract

X-ray crystallography is the most widely used method for determining the 3-dimensional structure of proteins. Selecting target proteins that can yield high-quality crystals for X-ray crystallography is a challenging task. Predicting protein crystallization propensity from sequence information is useful for selecting target proteins for crystallization. Recently, support vector machines have been widely used to solve various biological problems. In this work, we present SVMCRYS, a method that uses a support vector machine to classify protein sequences as 'amenable to crystallization' or 'resistant to crystallization'. SVMCRYS was trained on a dataset containing 728 sequences that gave diffraction-quality crystals and 728 sequences for which work had been stopped before a crystal was obtained. The performance of SVMCRYS was compared with that of other sequence-based crystallization prediction methods such as SECRET, CRYSTALP, OB-Score, ParCrys and XtalPred using three different datasets. SVMCRYS achieved a better prediction rate, with higher sensitivity and specificity. Our analysis suggests that SVMCRYS can be used to predict which proteins are amenable to crystallization and which are difficult to crystallize. The SVMCRYS software, dataset and feature set can be obtained from http://www3.ntu.edu.sg/home/EPNSugan/index_files/svmcrys.htm.
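
As an illustration only, the following sketch shows how a sequence might be turned into numeric features and how sensitivity and specificity are computed for a two-class predictor. The abstract does not state which features SVMCRYS uses, so the amino acid composition vector here is an assumption, and the function names `aa_composition` and `sensitivity_specificity` are hypothetical. The SVM training step itself is omitted; this is not the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every feature vector lines up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Assumption: composition is used purely as an example feature set;
    the abstract does not describe the actual SVMCRYS features.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels.

    Label convention (an assumption for this sketch):
    1 = amenable to crystallization, 0 = resistant to crystallization.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

In such a setup, the feature vectors from `aa_composition` would be passed to an SVM trainer, and the predicted labels on a held-out dataset would be scored with `sensitivity_specificity`.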

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence*
  • Artificial Intelligence*
  • Crystallography, X-Ray / methods*
  • Databases, Protein
  • Nuclear Magnetic Resonance, Biomolecular
  • Proteins / chemistry*
  • Proteins / metabolism
  • ROC Curve
  • Reproducibility of Results
  • Structure-Activity Relationship

Substances

  • Proteins