Supervised segmentation of phenotype descriptions for the human skeletal phenome using hybrid methods

BMC Bioinformatics. 2012 Oct 15;13:265. doi: 10.1186/1471-2105-13-265.

Abstract

Background: In recent years, a significant amount of research has addressed the ontology-based formalization of phenotype descriptions. To fully capture the knowledge expressed within these descriptions, we need to take advantage of their inner structure, which implicitly combines qualities and anatomical entities. The first step in this process is the segmentation of the phenotype descriptions into their atomic elements.

Results: We present a two-phase hybrid segmentation method that combines a series of individual classifiers using different aggregation schemes (set operations and simple majority voting). The approach is tested on a corpus composed of skeletal phenotype descriptions drawn from the Human Phenotype Ontology. Experimental results show that the best hybrid method achieves an F-Score of 97.05% in the first phase and F-Scores of 97.16% / 94.50% in the second phase.
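
The aggregation schemes named above (set operations and simple majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the label names ("Q" for quality, "A" for anatomical entity), the segment representation as (start, end, label) tuples, the tie-breaking rule, and all function names are assumptions made purely for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token labels from several classifiers by simple majority.

    `predictions` is a list of label sequences, one per classifier, all of
    the same length. Assumption: when no label has a strict majority, the
    label of the first classifier is kept.
    """
    combined = []
    for labels in zip(*predictions):
        label, count = Counter(labels).most_common(1)[0]
        combined.append(label if count > len(labels) / 2 else labels[0])
    return combined


def f_score(predicted, gold):
    """Balanced F-score (harmonic mean of precision and recall) over sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical token labels from three classifiers for a three-token phrase.
votes = [
    ["Q", "A", "A"],
    ["Q", "Q", "A"],
    ["Q", "A", "A"],
]
voted = majority_vote(votes)

# Hypothetical segments proposed by two classifiers, as (start, end, label).
segments_a = {(0, 1, "Q"), (1, 3, "A")}
segments_b = {(0, 1, "Q"), (2, 3, "A")}
gold = {(0, 1, "Q"), (1, 3, "A")}

# Set-operation aggregation: union favours recall, intersection precision.
union = segments_a | segments_b
intersection = segments_a & segments_b

print(voted)
print(round(f_score(union, gold), 4), round(f_score(intersection, gold), 4))
```

In this toy setting the union keeps every segment any classifier proposed, while the intersection keeps only segments all classifiers agree on, which is one way to see why the benefit of a hybrid scheme depends on the goal and on the data.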

Conclusions: The performance of the initial segmentation of anatomical entities and qualities (phase I) is not affected by the presence or absence of external resources, such as domain dictionaries. More generally, hybrid methods may not always improve segmentation accuracy, as they depend heavily on the goal and on the characteristics of the data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bone and Bones / abnormalities*
  • Data Interpretation, Statistical
  • Humans
  • Phenotype*