Prediction of Apoptosis Protein Subcellular Localization with Multilayer Sparse Coding and Oversampling Approach

Biomed Res Int. 2019 Jan 30:2019:2436924. doi: 10.1155/2019/2436924. eCollection 2019.

Abstract

The prediction of apoptosis protein subcellular localization plays an important role in understanding the processes of cell proliferation and death. Computational approaches to this problem have recently become popular because traditional biological experiments are too costly and time-consuming to keep pace with the growth of sequence data. To improve the prediction accuracy of apoptosis protein subcellular localization, we proposed a method that combines sparse coding with a traditional feature extraction algorithm to obtain a sparse representation of apoptosis protein sequences, uses multilayer pooling over dictionaries of different sizes to integrate the processed features, and applies an oversampling approach to reduce the influence of unbalanced data sets. The extracted features were then input to a support vector machine to predict the subcellular localization of each apoptosis protein. Experimental results obtained with the jackknife test on two benchmark data sets indicate that our method can significantly improve the accuracy of apoptosis protein subcellular localization prediction.
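The evaluation flow described above (feature extraction, oversampling of the minority classes, classification, jackknife test) can be illustrated with a minimal sketch. It is not the authors' implementation: plain amino acid composition stands in for the sparse-coded and pooled features, random duplication stands in for the oversampling approach (which the abstract does not specify), and a nearest-centroid rule stands in for the support vector machine. All function names, the toy sequences, and the class labels are hypothetical.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]

def oversample(X, y, rng):
    """Randomly duplicate samples until every class matches the largest one.
    Assumption: simple random oversampling, not necessarily the paper's scheme."""
    by_class = {}
    for x, label in zip(X, y):
        by_class.setdefault(label, []).append(x)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out

def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in classifier (NOT the SVM used in the paper): nearest class centroid."""
    sums, counts = {}, Counter(y_train)
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v

    def dist(label):
        # Squared Euclidean distance from x to the centroid of this class.
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sorted(sums), key=dist)

def jackknife_accuracy(X, y, seed=0):
    """Jackknife (leave-one-out) test: hold out each sample once, train on the rest."""
    rng = random.Random(seed)
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        # Oversample the training fold only, so the held-out sample never leaks in.
        X_bal, y_bal = oversample(X_train, y_train, rng)
        if nearest_centroid_predict(X_bal, y_bal, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Hypothetical, unbalanced toy data (4 vs. 2 sequences); not from the benchmark sets.
toy = [("AAAAAG", "cyto"), ("AAAGAA", "cyto"), ("AAAAGA", "cyto"),
       ("AAGAAA", "cyto"), ("LLLLLK", "memb"), ("LLLKLL", "memb")]
X = [composition(seq) for seq, _ in toy]
y = [label for _, label in toy]
print("jackknife accuracy:", jackknife_accuracy(X, y))
```

One design point worth noting in this sketch: the oversampling step runs inside the jackknife loop, on the training fold alone. Balancing the full data set first would let duplicates of the held-out sample appear in the training data and inflate the reported accuracy.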

MeSH terms

  • Algorithms
  • Amino Acid Sequence / genetics
  • Apoptosis / genetics*
  • Apoptosis Regulatory Proteins / genetics*
  • Apoptosis Regulatory Proteins / ultrastructure
  • Cell Proliferation / genetics
  • Computational Biology*
  • Databases, Protein
  • Humans
  • Protein Transport / genetics
  • Support Vector Machine

Substances

  • Apoptosis Regulatory Proteins