Formalization of gene regulation knowledge using ontologies and gene ontology causal activity models

Biochim Biophys Acta Gene Regul Mech. 2021 Nov-Dec;1864(11-12):194766. doi: 10.1016/j.bbagrm.2021.194766. Epub 2021 Oct 25.

Abstract

Computational research on gene regulation requires handling and integrating large amounts of heterogeneous data. The Gene Ontology has demonstrated that ontologies play a fundamental role in biological data interoperability and integration. Ontologies help to express data and knowledge in a machine-processable way, which enables complex querying and advanced exploitation of distributed data. Improving data interoperability in gene regulation is a major objective of the GREEKC Consortium, which aims to develop a standardized gene regulation knowledge commons. GREEKC proposes the use of ontologies and semantic tools for developing interoperable gene regulation knowledge models, which should support data annotation. In this work, we study how such knowledge models can be generated from cartoons of gene regulation scenarios. The proposed method consists of generating natural-language descriptions of the cartoons; extracting the entities from the texts; finding those entities in existing ontologies to reuse as much content as possible, especially from well-known and well-maintained ontologies such as the Gene Ontology, the Sequence Ontology, the Relation Ontology and ChEBI; and implementing the knowledge models. The models have been implemented using Protégé, a general ontology editor, and Noctua, the tool developed by the Gene Ontology Consortium for building causal activity models, which capture more comprehensive annotations of genes and link their activities in a causal framework for Gene Ontology annotations. We applied the method to two gene regulation scenarios and illustrate how the resulting models can support the annotation of data from research articles.
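
As an illustration only, the entity-reuse step of the method (looking up extracted entities in existing ontologies) might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `map_entities`, the in-memory `LEXICON`, and the exact-label matching rule are all hypothetical, and the term identifiers are included for illustration and should be checked against current ontology releases.

```python
# Hypothetical sketch: map entities extracted from a cartoon description
# to terms in existing ontologies, so that content is reused where possible.

# Assumption: a tiny hand-built lexicon stands in for real ontology lookups.
# Keys are lower-cased term labels; values are (ontology name, term id).
LEXICON = {
    "promoter": ("Sequence Ontology", "SO:0000167"),
    "dna-binding transcription factor activity": ("Gene Ontology", "GO:0003700"),
    "regulation of dna-templated transcription": ("Gene Ontology", "GO:0006355"),
}


def map_entities(entities, lexicon=LEXICON):
    """Split entities into those with an existing ontology term and those without.

    Matching is by exact label after lower-casing and whitespace normalization,
    which is a simplification chosen for this sketch.
    """
    matched = {}
    unmatched = []
    for entity in entities:
        key = " ".join(entity.lower().split())
        if key in lexicon:
            matched[entity] = lexicon[key]
        else:
            # Candidates for manual review or for a new term in the model.
            unmatched.append(entity)
    return matched, unmatched


if __name__ == "__main__":
    matched, unmatched = map_entities(["Promoter", "enhancer loop"])
    print(matched)
    print(unmatched)
```

In this sketch, entities without a matching label are returned separately, so a curator can decide whether a synonym exists in an ontology or a new class is needed in the knowledge model.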

Keywords: Bioinformatics; Gene ontology; Gene regulation; Knowledge representation; Ontologies.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Curation
  • Gene Expression Regulation*
  • Gene Ontology
  • Models, Genetic*
  • Molecular Sequence Annotation