Hierarchical classification of gene ontology terms using the GOstruct method

J Bioinform Comput Biol. 2010 Apr;8(2):357-76. doi: 10.1142/s0219720010004744.

Abstract

Protein function prediction is an active area of research in bioinformatics, yet the transfer of annotation on the basis of sequence or structural similarity remains widely used as an annotation method. Most of today's machine learning approaches reduce the problem to a collection of binary classification problems, each deciding whether a protein performs a particular function, sometimes with a post-processing step to combine the binary outputs. We propose a method that directly predicts a full functional annotation of a protein by modeling the structure of the Gene Ontology hierarchy in the framework of kernel methods for structured-output spaces. Our empirical results, measured on the Mousefunc benchmark dataset, show improved performance over a BLAST nearest-neighbor method and over algorithms that employ a collection of binary classifiers.
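
To make the idea of predicting a full annotation concrete, the following minimal Python sketch shows one way a hierarchy-consistent label could be represented. It is an illustration under stated assumptions, not the GOstruct implementation: the toy hierarchy, its term names, and the functions `ancestor_closure`, `label_vector`, and `label_kernel` are all hypothetical, and the linear kernel on label vectors is only one plausible choice of similarity between annotations.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy (child term -> parent terms).
# The names are placeholders, not real Gene Ontology identifiers.
TOY_PARENTS: Dict[str, List[str]] = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "kinase": ["catalysis"],
}


def ancestor_closure(terms: Set[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the given terms together with all of their ancestors."""
    closed: Set[str] = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(parents[term])
    return closed


def label_vector(terms: Set[str], parents: Dict[str, List[str]],
                 order: List[str]) -> List[int]:
    """Encode an annotation as a 0/1 vector over all terms, ancestors included."""
    closed = ancestor_closure(terms, parents)
    return [1 if term in closed else 0 for term in order]


def label_kernel(y1: List[int], y2: List[int]) -> int:
    """Assumed linear kernel: counts the terms two annotations share."""
    return sum(a * b for a, b in zip(y1, y2))


TERM_ORDER = sorted(TOY_PARENTS)
y_dna = label_vector({"dna_binding"}, TOY_PARENTS, TERM_ORDER)
y_kin = label_vector({"kinase"}, TOY_PARENTS, TERM_ORDER)
shared = label_kernel(y_dna, y_kin)  # the two annotations share only "root"
```

In this representation a whole annotation is a single structured object, so two annotations can be compared directly, whereas a collection of binary classifiers scores each term in isolation and must reconcile the outputs with the hierarchy afterwards.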

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Artificial Intelligence
  • Computational Biology
  • Databases, Protein
  • Mice
  • Models, Genetic
  • Models, Statistical
  • Neural Networks, Computer
  • Proteins / classification
  • Proteins / genetics*
  • Proteins / physiology*
  • Sequence Alignment / classification*
  • Sequence Alignment / statistics & numerical data*

Substances

  • Proteins