Multi-label learning with fuzzy hypergraph regularization for protein subcellular location prediction

IEEE Trans Nanobioscience. 2014 Dec;13(4):438-47. doi: 10.1109/TNB.2014.2341111. Epub 2014 Jul 31.

Abstract

Protein subcellular location prediction aims to determine, by computational means, where in a cell a protein resides. To address the main limitations of existing methods, we propose FHML, a hierarchical multi-label learning model that handles both single-location and multi-location proteins. Latent concepts are extracted by decomposing the feature space and the label space under a nonnegative data factorization framework. These latent concepts serve as a codebook that indirectly connects protein features to their annotations. We construct dual fuzzy hypergraphs to capture the intrinsic high-order relations embedded in both the feature space and the label space. Finally, subcellular location annotations are propagated from labeled to unlabeled proteins through dual fuzzy hypergraph Laplacian regularization. Experimental results on six protein benchmark datasets show that the proposed method outperforms state-of-the-art methods and illustrate the benefit of exploiting both feature correlations and label correlations.
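
The propagation step can be illustrated with a minimal sketch. The abstract does not give FHML's formulation, so the code below is not the authors' method: it assumes a single fuzzy hypergraph over the feature space, the widely used normalized hypergraph Laplacian, and a simple regularized least-squares propagation. The function names, the k-nearest-neighbour hyperedge construction, the Gaussian membership degrees, and the parameters k, sigma, and mu are all hypothetical choices made for illustration.

```python
import numpy as np


def fuzzy_incidence(X, k=3, sigma=1.0):
    """Build a fuzzy incidence matrix H (samples x hyperedges).

    Assumption: one hyperedge per sample, containing that sample and its
    k nearest neighbours, with Gaussian membership degrees in (0, 1].
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # squared distances
    H = np.zeros((n, n))
    for e in range(n):
        members = np.argsort(d2[e])[: k + 1]  # centre plus k neighbours
        H[members, e] = np.exp(-d2[e, members] / (2 * sigma ** 2))
    return H


def hypergraph_laplacian(H, w=None):
    """Normalized Laplacian L = I - Dv^-1/2 H W De^-1 H^T Dv^-1/2."""
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)
    dv = H @ w            # vertex degrees: weighted sum of memberships
    de = H.sum(axis=0)    # hyperedge degrees: sum of memberships
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ dv_inv_sqrt
    return np.eye(n_vertices) - theta


def propagate(L, Y, mu=1.0):
    """Minimize ||F - Y||^2 + mu * tr(F^T L F), i.e. solve (I + mu L) F = Y.

    Rows of Y are label indicator vectors; unlabeled samples are all zeros.
    """
    return np.linalg.solve(np.eye(L.shape[0]) + mu * L, Y)
```

In this sketch, each row of the returned score matrix F holds one score per location, and a multi-location prediction could be read off by thresholding those scores. A dual variant in the spirit of the abstract would add a second Laplacian built over the label space, but how FHML combines the two is not stated here.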

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Database Management Systems
  • Databases, Protein*
  • Fuzzy Logic
  • Information Storage and Retrieval / methods
  • Molecular Sequence Data
  • Pattern Recognition, Automated / methods*
  • Proteins / chemistry*
  • Proteins / metabolism*
  • Sequence Alignment / methods
  • Sequence Analysis, Protein / methods*
  • Subcellular Fractions / chemistry*
  • Subcellular Fractions / metabolism*

Substances

  • Proteins