INGA 2.0: improving protein function prediction for the dark proteome

Nucleic Acids Res. 2019 Jul 2;47(W1):W373-W378. doi: 10.1093/nar/gkz375.

Abstract

Our current knowledge of complex biological systems is stored in a computable form through the Gene Ontology (GO), which provides a comprehensive description of gene function. Prediction of GO terms from sequence remains, however, a challenging task, which is particularly critical for novel genomes. Here we present INGA 2.0, a new version of the INGA software for protein function prediction. INGA exploits homology, domain architecture, interaction networks and information from the 'dark proteome', such as transmembrane and intrinsically disordered regions, to generate a consensus prediction. INGA was ranked among the top ten methods in both the CAFA2 and CAFA3 blind tests. The new algorithm can process entire genomes in a few hours, or less when additional input files are provided. The new interface provides a better user experience by integrating filters and widgets to explore the graph structure of the predicted terms. The INGA web server, databases and benchmarking are available at https://inga.bio.unipd.it/.
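
The idea of a consensus prediction over several evidence sources can be illustrated with a minimal sketch. The abstract does not state how INGA combines its components, so the noisy-OR rule used here is an assumption chosen for illustration, not the INGA algorithm. The function name `consensus_scores` and the example scores are hypothetical; the GO identifiers are real terms (GO:0016020, membrane; GO:0005215, transporter activity).

```python
from typing import Dict, List


def consensus_scores(component_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine per-component GO term probabilities with a noisy-OR rule.

    Assumption: each component (homology, domain architecture, ...) reports
    an independent probability per GO term. This is an illustrative rule,
    not the combination scheme used by INGA.
    """
    # Running product of (1 - p) per term, i.e. the probability that
    # no component supports the term.
    not_predicted: Dict[str, float] = {}
    for scores in component_scores:
        for term, p in scores.items():
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"score for {term} is outside [0, 1]: {p}")
            not_predicted[term] = not_predicted.get(term, 1.0) * (1.0 - p)
    # A term is predicted if at least one component supports it.
    return {term: 1.0 - q for term, q in not_predicted.items()}


# Hypothetical scores from two evidence sources for a single protein.
homology = {"GO:0005215": 0.6, "GO:0016020": 0.5}
transmembrane = {"GO:0016020": 0.8}

combined = consensus_scores([homology, transmembrane])
for term, score in sorted(combined.items()):
    print(term, round(score, 3))
```

Under this rule, a term supported by two sources (here GO:0016020) scores higher than either source alone, while a term seen by only one source keeps its original score.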

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computational Biology / methods*
  • Databases, Protein
  • Gene Ontology
  • Internet
  • Models, Molecular
  • Molecular Sequence Annotation*
  • Protein Interaction Mapping
  • Proteins / chemistry*
  • Proteins / physiology
  • Sequence Alignment
  • Sequence Analysis, Protein
  • Sequence Homology, Amino Acid
  • Software*
  • Structure-Activity Relationship

Substances

  • Proteins