SNOSite: exploiting maximal dependence decomposition to identify cysteine S-nitrosylation with substrate site specificity

PLoS One. 2011;6(7):e21849. doi: 10.1371/journal.pone.0021849. Epub 2011 Jul 15.

Abstract

S-nitrosylation, the covalent attachment of nitric oxide (NO) to the sulfur atom of cysteine, is a selective and reversible protein post-translational modification (PTM) that regulates protein activity, localization, and stability. Despite its implication in the regulation of protein function and cell signaling, the substrate specificity of cysteine S-nitrosylation remains unknown. Based on a total of 586 experimentally identified S-nitrosylation sites from SNAP/L-cysteine-stimulated mouse endothelial cells, this work presents an informatics investigation of S-nitrosylation sites, covering structural factors such as the flanking amino acid composition, the accessible surface area (ASA), and physicochemical properties, i.e., positive charge and side chain interaction parameter. Because conserved motifs are difficult to obtain by conventional motif analysis, maximal dependence decomposition (MDD) was applied to obtain statistically significant conserved motifs. A support vector machine (SVM) was then used to generate a predictive model for each MDD-clustered motif. In five-fold cross-validation, the MDD-clustered SVMs achieved an accuracy of 0.902, and they also performed promisingly on an independent test set. The effectiveness of the model was demonstrated by the correct identification of previously reported S-nitrosylation sites of Bos taurus dimethylarginine dimethylaminohydrolase 1 (DDAH1) and human hemoglobin subunit beta (HBB). Finally, the MDD-clustered model was adopted to construct an effective web-based tool, named SNOSite (http://csb.cse.yzu.edu.tw/SNOSite/), for identifying S-nitrosylation sites in uncharacterized protein sequences.
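The clustering idea in the abstract can be illustrated with a minimal Python sketch. It is not the SNOSite implementation: the function names `cysteine_windows`, `chi_square`, and `mdd_split`, the flank width, and the toy windows are all assumptions made here for illustration. The sketch shows one simplified MDD step: for each flanking position, sum the chi-square dependence between "matches the consensus residue at this position" and the residues at every other position, then split the windows on the position with the largest sum. A fuller MDD procedure would typically also group amino acids into classes, apply a significance threshold, and recurse on the subgroups before one classifier is trained per cluster.

```python
from collections import Counter


def cysteine_windows(seq, flank=3):
    """Return (1-based position, window) for every cysteine in seq.

    Windows are padded with '-' at the sequence ends. The flank width is a
    hypothetical choice; the abstract does not state the window size.
    """
    padded = "-" * flank + seq + "-" * flank
    return [(i + 1, padded[i:i + 2 * flank + 1])
            for i, aa in enumerate(seq) if aa == "C"]


def chi_square(windows, i, j):
    """Chi-square statistic between consensus match at position i and residue at j."""
    consensus = Counter(w[i] for w in windows).most_common(1)[0][0]
    n = len(windows)
    table = Counter((w[i] == consensus, w[j]) for w in windows)
    row = Counter(w[i] == consensus for w in windows)
    col = Counter(w[j] for w in windows)
    stat = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n  # never zero: both margins are > 0
            stat += (table[(r, c)] - expected) ** 2 / expected
    return stat


def mdd_split(windows):
    """One simplified MDD step: split on the position with maximal dependence."""
    center = len(windows[0]) // 2  # the central cysteine is skipped
    positions = [p for p in range(len(windows[0])) if p != center]
    scores = {i: sum(chi_square(windows, i, j) for j in positions if j != i)
              for i in positions}
    best = max(scores, key=scores.get)
    consensus = Counter(w[best] for w in windows).most_common(1)[0][0]
    match = [w for w in windows if w[best] == consensus]
    rest = [w for w in windows if w[best] != consensus]
    return best, consensus, match, rest


if __name__ == "__main__":
    # Toy windows (invented for illustration, not data from the study).
    toy = ["KACGE", "KACGE", "KACGE", "RACGD", "RACGD"]
    position, residue, match, rest = mdd_split(toy)
    print(position, residue, match, rest)
```

In a pipeline of the kind the abstract describes, each resulting subgroup would then be encoded as feature vectors and used to train its own SVM; that step is omitted here because the abstract does not specify the encoding or kernel.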

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amidohydrolases / chemistry
  • Amidohydrolases / metabolism
  • Amino Acid Motifs
  • Amino Acid Sequence
  • Animals
  • Cattle
  • Cluster Analysis
  • Computational Biology / methods*
  • Cysteine / metabolism*
  • Databases, Protein
  • Humans
  • Internet
  • Mice
  • Models, Biological
  • Molecular Sequence Data
  • Nitrosation
  • Reproducibility of Results
  • Software*
  • Solvents
  • Substrate Specificity

Substances

  • Solvents
  • Amidohydrolases
  • dimethylargininase
  • Cysteine