Updates to the Alliance of Genome Resources central infrastructure

Genetics. 2024 May 7;227(1):iyae049. doi: 10.1093/genetics/iyae049.

Abstract

The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. Model organisms currently represented in the Alliance are budding yeast, Caenorhabditis elegans, Drosophila, zebrafish, frog, laboratory mouse, laboratory rat, and the Gene Ontology Consortium. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and application programming interfaces (APIs). Here, we focus on developments over the last 2 years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse software. We describe our progress toward a central persistent database to support curation, the data modeling that underpins harmonization, and progress toward a state-of-the-art literature curation system with integrated artificial intelligence and machine learning (AI/ML).

Keywords: Caenorhabditis elegans; Drosophila; data integration; database; knowledgebase; mouse; software; text mining; yeast; zebrafish.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Databases, Genetic*
  • Genome
  • Genomics* / methods
  • Mice
  • Software