Predicting the extension of biomedical ontologies

PLoS Comput Biol. 2012;8(9):e1002630. doi: 10.1371/journal.pcbi.1002630. Epub 2012 Sep 13.

Abstract

Developing and extending a biomedical ontology is a very demanding task that can never be considered complete, given our ever-evolving understanding of the life sciences. Extension in particular can benefit from the automation of some of its steps, freeing experts to focus on harder tasks. Here we present a strategy to support the automation of change capturing, the step of ontology extension in which the need for new concepts or relations is identified. Our strategy is based on predicting the areas of an ontology that will undergo extension in a future version by applying supervised learning over features of previous ontology versions. We used the Gene Ontology as our test bed and obtained encouraging results, with average F-measure reaching 0.79 for a subset of biological process terms. Our strategy also outperformed state-of-the-art change capturing methods. In addition, we have identified several issues concerning the prediction of ontology evolution and have delineated a general framework for ontology extension prediction. Our strategy can be applied to any biomedical ontology with versioning, to help focus either manual or semi-automated extension methods on the areas of the ontology that need extension.
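The general idea of learning from earlier versions and scoring with F-measure can be illustrated with a minimal sketch. It is not the method used in the paper: the toy ontology versions, the annotation-count feature, the definition of extension as "gained at least one child", and the single-threshold learner are all assumptions chosen for brevity.

```python
def extended_terms(old_children, new_children):
    """Terms of the old version that gained at least one child in the new one.

    Assumption: 'extension' here means gaining a direct child term.
    """
    return {term for term, kids in old_children.items()
            if new_children.get(term, set()) - kids}


def f_measure(predicted, actual):
    """Harmonic mean of precision and recall over two sets of terms."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


def fit_threshold(feature, labels):
    """Pick the threshold t whose rule 'feature >= t' best fits the labels.

    A one-feature decision stump stands in for a real supervised learner.
    """
    best_t, best_f = None, -1.0
    for t in sorted(set(feature.values())):
        predicted = {term for term, value in feature.items() if value >= t}
        score = f_measure(predicted, labels)
        if score > best_f:
            best_t, best_f = t, score
    return best_t


# Hypothetical toy data: term -> set of direct children, for three versions.
v1 = {"A": {"B"}, "B": set(), "C": {"D"}, "D": set()}
v2 = {"A": {"B", "E"}, "B": set(), "C": {"D"}, "D": set(), "E": set()}
v3 = {"A": {"B", "E", "F"}, "B": set(), "C": {"D", "G"}, "D": set(),
      "E": set(), "F": set(), "G": set()}

# Hypothetical feature: number of annotations per term in each version.
annotations_v1 = {"A": 40, "B": 3, "C": 5, "D": 1}
annotations_v2 = {"A": 55, "B": 4, "C": 30, "D": 2, "E": 1}

# Train on the v1 -> v2 transition, then predict the v2 -> v3 transition.
threshold = fit_threshold(annotations_v1, extended_terms(v1, v2))
predicted = {term for term, n in annotations_v2.items() if n >= threshold}
actual = extended_terms(v2, v3)
score = f_measure(predicted, actual)
```

Training labels come from comparing two past versions, and evaluation uses a later pair of versions, so the predictor is never scored on the transition it was fitted to.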

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Computational Biology / methods*
  • Database Management Systems*
  • Evolution, Molecular*
  • Humans
  • Information Storage and Retrieval / methods*
  • Natural Language Processing*
  • Vocabulary, Controlled*

Grants and funding

This work was supported by the Portuguese Fundação para a Ciência e Tecnologia through the Multiannual Funding Programme, and the grant ref. SFRH/BD/42481/2007. The authors also wish to thank the European Commission for the financial support of the EPIWORK project under the Seventh Framework Programme (Grant #231807). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.