Use of a Probabilistic Motif Search to Identify Histidine Phosphotransfer Domain-Containing Proteins

PLoS One. 2016 Jan 11;11(1):e0146577. doi: 10.1371/journal.pone.0146577. eCollection 2016.

Abstract

The wealth of newly obtained proteomic information affords researchers the possibility of searching for proteins of a given structure or function. Here we describe a general method for the detection of a protein domain of interest in any species for which a complete proteome exists. In particular, we apply this approach to identify histidine phosphotransfer (HPt) domain-containing proteins across a range of eukaryotic species. From the sequences of known HPt domains, we created an amino acid occurrence matrix which we then used to define a conserved, probabilistic motif. Examination of various organisms either known to contain (plant and fungal species) or believed to lack (mammals) HPt domains established criteria by which new HPt candidates were identified and ranked. Search results using a probabilistic motif matrix compare favorably with data found in several commonly used protein structure/function databases: our method identified all known HPt proteins in the Arabidopsis thaliana proteome, confirmed the absence of such motifs in mice and humans, and suggested new candidate HPts in several organisms. Moreover, probabilistic motif searching can be applied more generally, in a manner both readily customized and computationally compact, to other protein domains; this utility is demonstrated by our identification of histones in a range of eukaryotic organisms.
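
The general idea of an amino acid occurrence matrix used as a probabilistic motif can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give their scoring scheme, so the pseudocount, the uniform background frequency, the log-odds score, and all function names here are assumptions chosen for illustration. The short sequences in TOY_MOTIFS are invented toy data, not real HPt domains.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Invented toy motif instances (NOT real HPt sequences), all the same length.
TOY_MOTIFS = ["HELKGSS", "HQLKGSA", "HELRGSS", "HSLKGAS"]


def occurrence_matrix(aligned, pseudocount=1.0):
    """Return one {residue: probability} dict per motif position.

    Assumption: a simple additive pseudocount keeps unseen residues
    from getting probability zero.
    """
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("all motif instances must have the same length")
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(width):
        counts = Counter(seq[pos] for seq in aligned)
        matrix.append({aa: (counts[aa] + pseudocount) / total
                       for aa in AMINO_ACIDS})
    return matrix


def log_odds_score(window, matrix, background=1.0 / len(AMINO_ACIDS)):
    """Sum of per-position log-odds against a uniform background (assumed)."""
    score = 0.0
    for residue, column in zip(window, matrix):
        # Non-standard residues (e.g. 'X') get the lowest probability in the column.
        p = column.get(residue, min(column.values()))
        score += math.log(p / background)
    return score


def best_window(protein, matrix):
    """Slide the motif along a protein; return (start, score) of the best hit,
    or None if the protein is shorter than the motif."""
    width = len(matrix)
    best = None
    for start in range(len(protein) - width + 1):
        score = log_odds_score(protein[start:start + width], matrix)
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

Under these assumptions, calling best_window on every sequence in a proteome and sorting the results by score would give a ranked candidate list; a cutoff could then be calibrated on organisms known to contain or believed to lack the domain, in the spirit of the criteria described in the abstract.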

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Motifs
  • Amino Acid Sequence
  • Animals
  • Arabidopsis / chemistry
  • Computational Biology / methods*
  • Dictyostelium
  • Drosophila melanogaster
  • Histidine / chemistry*
  • Histones / chemistry
  • Humans
  • Mice
  • Molecular Sequence Data
  • Phosphotransferases / chemistry*
  • Probability
  • Protein Structure, Tertiary
  • Proteome
  • Proteomics
  • Saccharomyces cerevisiae
  • Software
  • Zebrafish

Substances

  • Histones
  • Proteome
  • Histidine
  • Phosphotransferases

Grants and funding

This research was supported (in part) by a grant to DIR from the Amherst College Faculty Research Award Program, as funded by The H. Axel Schupf '57 Fund for Intellectual Life. DS received support from Amherst College to complete this work in partial fulfillment of the requirements for the degree Bachelor of Arts with Honors. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.