Finding genomic ontology terms in text using evidence content

BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S21. doi: 10.1186/1471-2105-6-S1-S21. Epub 2005 May 24.

Abstract

Background: The development of text mining systems that use the scientific literature to annotate biological entities with their properties is an important recent research topic. These systems must first recognize the biological entities and properties in the text, and then decide which pairs represent valid annotations.

Methods: This paper introduces a novel unsupervised method for recognizing biological properties in unstructured text, based on the evidence content of their names.

Results: We report the results of applying our method to BioCreative tasks 2.1 and 2.2, where it identified Gene Ontology annotations and their evidence in a set of articles.

Conclusion: From the performance obtained in BioCreative, we concluded that an automatic annotation system can effectively use our method to identify biological properties in unstructured text.

MeSH terms

  • Computational Biology / methods*
  • Genomics / classification*
  • Pattern Recognition, Automated / methods*
  • Periodicals as Topic*
  • Terminology as Topic*
  • Textbooks as Topic*
  • Writing