Multi-resolution visualization and analysis of biomolecular networks through hierarchical community detection and web-based graphical tools

PLoS One. 2020 Dec 22;15(12):e0244241. doi: 10.1371/journal.pone.0244241. eCollection 2020.

Abstract

The visual exploration and analysis of biomolecular networks is of paramount importance for identifying hidden and complex interaction patterns among proteins. Although many tools have been proposed for this task, they mainly focus on querying and visualizing a single protein together with its neighborhood. Global exploration of the entire network and interpretation of its underlying structure remain difficult, mainly because of the excessively large size of biomolecular networks. In this paper we propose a novel multi-resolution representation and exploration approach that exploits hierarchical community detection algorithms to identify communities occurring in biomolecular networks. The proposed graphical rendering combines two types of nodes (proteins and communities) and three types of edges (protein-protein, community-community, protein-community), and displays communities at different resolutions, allowing the user to interactively zoom in and out across levels of the hierarchy. Links among communities are shown in terms of relationships and functional correlations among the biomolecules they contain. The user can also combine this form of navigation with a vertex-centric visualization to identify the communities holding a target biomolecule. Since communities gather limited-size groups of correlated proteins, the visualization and exploration of large and complex networks becomes feasible on off-the-shelf computers. The proposed graphical exploration strategies have been implemented and integrated in UNIPred-Web, a web application that we recently introduced to combine the UNIPred algorithm, which addresses both network integration and protein function prediction in an imbalance-aware fashion, with an easy-to-use vertex-centric exploration of the integrated network. The tool has been substantially revised in several respects, including the core prediction algorithm. Several tests on networks of different size and connectivity have been conducted to demonstrate the potential of our methodology; moreover, enrichment analyses have been performed to assess the biological meaningfulness of the detected communities. Finally, a CoV-human network has been embedded in the system, and a corresponding case study is presented, including the visualization and the prediction of human host proteins that potentially interact with SARS-CoV-2 proteins.
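
The mixed rendering described above (two node types, three edge types) can be illustrated with a minimal Python sketch. It is not the UNIPred-Web implementation: the function name `build_view`, the single-level `community_of` mapping, the `expanded` set, and the choice to sum edge weights between collapsed communities are all assumptions made for illustration. The community assignment is taken as given rather than computed by a hierarchical detection algorithm.

```python
from collections import defaultdict


def build_view(edges, community_of, expanded):
    """Build a mixed protein/community view of a weighted network.

    edges        -- iterable of (protein_a, protein_b, weight)
    community_of -- dict mapping each protein to its community id (assumed given)
    expanded     -- set of community ids whose member proteins are displayed
    Hypothetical helper: communities not in `expanded` appear as single nodes.
    """
    def visible(protein):
        # A protein is drawn as itself only if its community is expanded.
        community = community_of[protein]
        if community in expanded:
            return ("protein", protein)
        return ("community", community)

    nodes = {visible(p) for p in community_of}

    # Assumption: parallel links between the same pair of visible nodes
    # are aggregated by summing their weights.
    weights = defaultdict(float)
    for a, b, w in edges:
        u, v = visible(a), visible(b)
        if u == v:
            continue  # link hidden inside a collapsed community
        weights[tuple(sorted((u, v)))] += w

    typed_edges = {}
    for (u, v), w in weights.items():
        if u[0] == v[0]:
            kind = f"{u[0]}-{v[0]}"  # protein-protein or community-community
        else:
            kind = "protein-community"
        typed_edges[(u, v)] = (kind, w)
    return nodes, typed_edges


# Toy network: three communities of two proteins each (made-up data).
edges = [("A", "B", 1.0), ("B", "C", 0.5), ("C", "D", 1.0),
         ("D", "E", 0.25), ("E", "F", 1.0)]
community_of = {"A": "c1", "B": "c1", "C": "c2",
                "D": "c2", "E": "c3", "F": "c3"}

# "Zoomed in" on c2 only: its proteins are shown, c1 and c3 stay collapsed.
nodes, typed_edges = build_view(edges, community_of, expanded={"c2"})
for pair, (kind, weight) in sorted(typed_edges.items()):
    print(pair, kind, weight)

# Fully "zoomed out": every community is collapsed.
coarse_nodes, coarse_edges = build_view(edges, community_of, expanded=set())
```

Adding or removing a community id in `expanded` plays the role of zooming in or out on that community. A real hierarchy would have several nested levels, which this one-level sketch does not model.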

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • COVID-19 / genetics*
  • COVID-19 / metabolism
  • COVID-19 / virology
  • Humans
  • Internet*
  • Metabolic Networks and Pathways / genetics*
  • Proteins / genetics
  • Proteins / metabolism
  • SARS-CoV-2 / genetics*
  • SARS-CoV-2 / metabolism
  • SARS-CoV-2 / pathogenicity

Substances

  • Proteins

Grants and funding

This study was partially funded by University of Milano through the internal project "Machine Learning and Big Data Analysis for Bioinformatics" - PSR2019_DIP_010_GVALE.