Protein family neighborhood analyzer - ProFaNA

PeerJ. 2023 Jul 21:11:e15715. doi: 10.7717/peerj.15715. eCollection 2023.

Abstract

Background: Functionally related genes are often clustered close together in genomes, particularly in prokaryotes. Although diverse evolutionary mechanisms give rise to this phenomenon, it can be used to predict the functions of uncharacterized genes.

Methods: Here, we provide a simple but robust statistical approach that leverages the vast amounts of genomic data available today. Considering a protein domain as a functional unit, one can explore other functional units (domains) that occur with statistically significant frequency within the genomic neighborhoods of the queried domain. This analysis can be performed at different taxonomic levels. Genomic sequencing projects often focus on large numbers of very closely related strains, e.g., pathogenic ones, so the taxonomic space is sampled unevenly. To correct for this, an optional procedure for averaging occurrences within subtaxa is available.
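
The idea described in the Methods can be illustrated with a minimal sketch. It is not the ProFaNA implementation: the toy genomes, the function names, the window size, and the use of a hypergeometric tail as the enrichment test are all assumptions made for illustration. The sketch counts how often each domain appears near a query domain, optionally averages those frequencies within subtaxa before combining them, and computes a generic enrichment p-value.

```python
from collections import defaultdict
from math import comb

# Hypothetical toy data: each genome is an ordered list of genes,
# and each gene is the set of domain identifiers it carries.
GENOMES = {
    "strain_A1": {"subtaxon": "species_A",
                  "genes": [{"D1"}, {"Q"}, {"D2"}, {"D3"}, {"D4"}]},
    "strain_A2": {"subtaxon": "species_A",
                  "genes": [{"D1"}, {"Q"}, {"D2"}, {"D5"}, {"D4"}]},
    "strain_B1": {"subtaxon": "species_B",
                  "genes": [{"D3"}, {"D4"}, {"Q"}, {"D1"}, {"D5"}]},
}


def neighborhood_domains(genes, query, window):
    """Return domains found within `window` genes of any gene carrying `query`."""
    found = set()
    for i, domains in enumerate(genes):
        if query in domains:
            lo, hi = max(0, i - window), min(len(genes), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    found |= genes[j]
    found.discard(query)  # the query domain itself is not a neighbor of interest
    return found


def neighbor_frequencies(genomes, query, window=1, average_subtaxa=False):
    """Fraction of query-containing genomes in which each domain is a neighbor.

    With average_subtaxa=True, the fraction is computed inside each subtaxon
    first and the per-subtaxon fractions are then averaged, so that a heavily
    sequenced subtaxon counts only once.
    """
    groups = defaultdict(list)
    for info in genomes.values():
        if any(query in gene for gene in info["genes"]):
            key = info["subtaxon"] if average_subtaxa else "all"
            groups[key].append(neighborhood_domains(info["genes"], query, window))
    freqs = defaultdict(float)
    for hits in groups.values():
        for domain in set().union(*hits):
            in_group = sum(domain in h for h in hits) / len(hits)
            freqs[domain] += in_group / len(groups)
    return dict(freqs)


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the domain.

    Assumed generic enrichment test, not necessarily the server's statistic.
    """
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


plain = neighbor_frequencies(GENOMES, "Q", window=1)
averaged = neighbor_frequencies(GENOMES, "Q", window=1, average_subtaxa=True)
print(plain)
print(averaged)
```

In this toy data set, domain D2 neighbors the query in both strains of species_A but not in species_B. Without averaging its frequency is 2/3, because the two closely related strains are counted separately; with subtaxon averaging it drops to 0.5, which reflects one of two species.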

Results: Several examples show that this approach can provide useful functional predictions for uncharacterized gene families and illustrate how this information can be combined with other approaches. The method is available as a web server at http://bioinfo.sggw.edu.pl/neighborhood_analysis.

Keywords: Comparative genomics; Gene function prediction; Genomic neighborhoods; Protein domains.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Chromosome Mapping / methods
  • Genome*
  • Genomics / methods
  • Proteins* / genetics

Substances

  • Proteins

Grants and funding

This work was supported in part by the Polish National Science Centre (grant no. 2020/37/B/NZ1/03603). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.