Identification of Sequence Variants within Experimentally Validated Protein Interaction Sites Provides New Insights into Molecular Mechanisms of Disease Development

Mol Inform. 2017 Sep;36(9). doi: 10.1002/minf.201700017. Epub 2017 Apr 28.

Abstract

Protein interactions (PI) underlie complex biological processes. Protein interaction partners include DNA, RNA, ions, small chemical compounds, and proteins (protein-protein interactions; PPI). Analysis of sequence variants within regions corresponding to experimentally validated PI sites presents novel opportunities for understanding complex diseases. Such information has not been systematically collected because the datasets are dispersed across databases and publications. Sequence variants and PI regions were obtained from the UniProt database. The location of each variant was compared to the start and end positions of each interaction region. Associations of sequence variants with phenotype were obtained from databases including COSMIC, GAD, PharmGKB, and dbSNP. We developed a catalogue of 603 sequence variants located within regions corresponding to experimentally validated PI sites, mostly PPI regions. These sequence variants were previously associated with risk of cancer and of reproductive, ageing-related, renal, and immune system diseases. The developed catalogue connects information from different research papers and databases, represents a new layer of information, and enables new hypotheses to be formulated. It provides a baseline for prioritizing sequence variants that may affect protein function and binding sites. The study contributes to the development of the proteogenomics field and provides new insights into the molecular mechanisms underlying disease development.
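
The positional comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the `Region` and `Variant` records, the `variants_in_regions` helper, the inclusive boundary check, and the placeholder accession and identifiers in the example data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical record for an annotated interaction region."""
    protein: str   # protein accession (placeholder values below)
    start: int     # first residue of the region, 1-based
    end: int       # last residue of the region, inclusive (assumed)
    partner: str   # free-text interaction partner


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for a sequence variant at one residue."""
    protein: str
    position: int  # 1-based residue position
    variant_id: str


def variants_in_regions(
    variants: List[Variant], regions: List[Region]
) -> List[Tuple[Variant, Region]]:
    """Return (variant, region) pairs where the variant lies inside the region."""
    # Group regions by protein so each variant is only compared
    # against regions annotated on the same protein.
    by_protein: Dict[str, List[Region]] = {}
    for region in regions:
        by_protein.setdefault(region.protein, []).append(region)

    hits: List[Tuple[Variant, Region]] = []
    for variant in variants:
        for region in by_protein.get(variant.protein, []):
            if region.start <= variant.position <= region.end:
                hits.append((variant, region))
    return hits


# Made-up example data; none of these identifiers come from the study.
example_regions = [Region("ACC_EXAMPLE", 100, 150, "partner protein")]
example_variants = [
    Variant("ACC_EXAMPLE", 120, "var_inside"),
    Variant("ACC_EXAMPLE", 200, "var_outside"),
]
example_hits = variants_in_regions(example_variants, example_regions)
```

For a proteome-scale comparison an interval tree or sorted-interval lookup would likely be preferable to the nested loop, but the linear scan keeps the inclusion test itself easy to read.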

Keywords: In silico; disease associations; gene ontology analysis; protein-protein interactions (PPI); single nucleotide variant (SNV).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Genetic Predisposition to Disease
  • Humans
  • Molecular Docking Simulation / methods*
  • Polymorphism, Genetic*
  • Protein Binding
  • Protein Interaction Mapping / methods*
  • Proteome / chemistry*
  • Proteome / genetics
  • Proteome / metabolism
  • Sequence Analysis, Protein / methods*

Substances

  • Proteome