Classifying RNA-binding proteins based on electrostatic properties

PLoS Comput Biol. 2008 Aug 8;4(8):e1000146. doi: 10.1371/journal.pcbi.1000146.

Abstract

Protein structure can provide new insight into the biological function of a protein and can enable the design of better experiments to learn its biological roles. Moreover, deciphering the interactions of a protein with other molecules can contribute to the understanding of the protein's function within cellular processes. In this study, we apply a machine learning approach for classifying RNA-binding proteins based on their three-dimensional structures. The method is based on characterizing unique properties of electrostatic patches on the protein surface. Using an ensemble of general protein features and specific properties extracted from the electrostatic patches, we have trained a support vector machine (SVM) to distinguish RNA-binding proteins from other positively charged proteins that do not bind nucleic acids. Specifically, the method was applied to proteins possessing the RNA recognition motif (RRM) and successfully distinguished RNA-binding proteins from RRM domains involved in protein-protein interactions. Overall, the method achieves 88% accuracy in classifying RNA-binding proteins, yet it cannot distinguish RNA-binding from DNA-binding proteins. Nevertheless, by applying a multiclass SVM approach, we were able to classify the RNA-binding proteins based on their RNA targets, specifically, whether they bind ribosomal RNA (rRNA), transfer RNA (tRNA), or messenger RNA (mRNA). Finally, we present here an innovative approach that does not rely on sequence or structural homology and could be applied to identify novel RNA-binding proteins with unique folds and/or binding motifs.
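
To make the classification step concrete, the following is a minimal sketch of a binary SVM separating two classes of feature vectors. It is not the authors' implementation: the abstract does not state the kernel, the feature set, or the software used, so this sketch assumes a linear SVM trained by batch subgradient descent on the regularized hinge loss. The two features, the numeric values, and the function names `train_linear_svm` and `predict` are all hypothetical and chosen only for illustration.

```python
# Toy, made-up feature vectors (NOT data from the study). Each row stands for
# a hypothetical protein described by two standardized features, for example
# (size of the largest positive patch, a general surface property).
X = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0), (3.0, 3.0),
     (-2.0, -2.0), (-3.0, -2.0), (-2.0, -3.0), (-3.0, -3.0)]
# +1 = RNA-binding, -1 = does not bind nucleic acids (illustrative labels).
y = [1, 1, 1, 1, -1, -1, -1, -1]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Batch subgradient descent on the L2-regularized hinge loss.

    Minimizes lam/2 * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i + b))).
    """
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        # Subgradient of the regularization term.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # point lies inside the margin or is misclassified
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


w, b = train_linear_svm(X, y)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
print(f"training accuracy on the toy data: {accuracy:.2f}")
```

The toy data are linearly separable by construction, so the accuracy printed here says nothing about the 88% figure reported above; a real evaluation would need held-out proteins and the actual patch-derived features.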

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence / physiology
  • Animals
  • Artificial Intelligence
  • Binding Sites
  • Computational Biology / methods*
  • DNA-Binding Proteins / chemistry
  • Databases, Protein
  • Humans
  • Pattern Recognition, Automated / methods
  • Protein Interaction Domains and Motifs* / physiology
  • RNA, Messenger / chemistry
  • RNA, Messenger / metabolism
  • RNA, Ribosomal / chemistry
  • RNA, Ribosomal / metabolism
  • RNA, Transfer / chemistry
  • RNA, Transfer / metabolism
  • RNA-Binding Proteins / chemistry
  • RNA-Binding Proteins / classification*
  • Sequence Analysis, Protein
  • Static Electricity
  • Structure-Activity Relationship

Substances

  • DNA-Binding Proteins
  • RNA, Messenger
  • RNA, Ribosomal
  • RNA-Binding Proteins
  • RNA, Transfer