Pathway-specific protein domains are predictive for human diseases

PLoS Comput Biol. 2019 May 10;15(5):e1007052. doi: 10.1371/journal.pcbi.1007052. eCollection 2019 May.

Abstract

Protein domains are basic functional units of proteins. Many protein domains are pervasive among diverse biological processes, yet some are associated with specific pathways. Human complex diseases are generally viewed as pathway-level disorders. Therefore, we hypothesized that pathway-specific domains could be highly informative for human diseases. To test the hypothesis, we developed a network-based scoring scheme to quantify the specificity of domain-pathway associations. We first generated domain profiles for human proteins, then constructed a co-pathway protein network based on the associations between domain profiles. Based on the resulting score, we classified human protein domains into pathway-specific domains (PSDs) and non-specific domains (NSDs). We found that PSDs contained more pathogenic variants than NSDs. PSDs were also enriched for disease-associated mutations that disrupt protein-protein interactions (PPIs) and tended to have a moderate number of domain interactions. These results suggest that mutations in PSDs are likely to disrupt within-pathway PPIs, resulting in functional failure of pathways. Finally, we demonstrated the capacity of PSDs to predict disease-associated genes, with experimental validation in zebrafish. Taken together, the network-based quantitative method of modeling domain-pathway associations presented herein suggests a mechanism by which protein domains associated with specific pathways influence the impact of mutations on disease, namely through perturbation of within-pathway PPIs. It also provides a novel genomic feature for interpreting genetic variants to facilitate the discovery of human disease genes.
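
To make the idea of domain-pathway specificity concrete, the following is a minimal sketch only. It is not the network-based scoring scheme of the paper, whose details are not given in this abstract. It substitutes a much simpler stand-in: one minus the normalized Shannon entropy of each domain's distribution over pathways. The function names, the 0.5 threshold, and the toy protein, domain, and pathway assignments are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def domain_pathway_specificity(protein_domains, protein_pathways):
    """Return {domain: score in [0, 1]}.

    A score of 1 means every occurrence of the domain falls in a single
    pathway; a score of 0 means it is spread evenly over all pathways.
    This is an entropy-based stand-in, NOT the paper's network-based score.
    """
    all_pathways = {p for ps in protein_pathways.values() for p in ps}
    # Count how often each domain co-occurs with each pathway.
    counts = defaultdict(Counter)
    for protein, domains in protein_domains.items():
        for domain in set(domains):
            for pathway in protein_pathways.get(protein, ()):
                counts[domain][pathway] += 1

    # Normalize by the largest possible entropy (uniform over all pathways).
    max_entropy = math.log(len(all_pathways)) if len(all_pathways) > 1 else 1.0
    scores = {}
    for domain, counter in counts.items():
        total = sum(counter.values())
        entropy = -sum((c / total) * math.log(c / total) for c in counter.values())
        scores[domain] = 1.0 - entropy / max_entropy
    return scores


def classify_domains(scores, threshold=0.5):
    """Label each domain PSD or NSD using a hypothetical cutoff."""
    return {d: ("PSD" if s >= threshold else "NSD") for d, s in scores.items()}


# Toy, made-up annotations (not real data).
protein_domains = {
    "P1": ["kinase", "SH2"],
    "P2": ["kinase"],
    "P3": ["kinase", "homeobox"],
    "P4": ["homeobox"],
}
protein_pathways = {
    "P1": ["signaling"],
    "P2": ["metabolism"],
    "P3": ["development"],
    "P4": ["development"],
}

scores = domain_pathway_specificity(protein_domains, protein_pathways)
labels = classify_domains(scores)
```

In this toy input the "kinase" domain appears once in each of the three pathways and so receives a score near 0, whereas "SH2" and "homeobox" each occur in a single pathway and receive a score of 1. A real analysis would need curated domain and pathway annotations and a principled cutoff rather than the arbitrary threshold used here.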

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Animals, Genetically Modified
  • Computational Biology
  • Coronary Artery Disease / etiology
  • Coronary Artery Disease / genetics
  • Coronary Artery Disease / metabolism
  • Disease / etiology*
  • Disease / genetics
  • Genetic Predisposition to Disease
  • Genetic Variation
  • Genome-Wide Association Study
  • Humans
  • Models, Animal
  • Models, Biological
  • Mutation
  • Polymorphism, Single Nucleotide
  • Protein Domains* / genetics
  • Protein Interaction Mapping
  • Protein Interaction Maps* / genetics
  • Zebrafish / genetics

Grants and funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIT) (NRF-2018M3C9A5064709, NRF-2018R1A5A2025079) to I.L. Funding for the open access charge has been provided by the National Research Foundation of Korea. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.