Predicting protein structural classes based on complex networks and recurrence analysis

J Theor Biol. 2016 Sep 7:404:375-382. doi: 10.1016/j.jtbi.2016.06.018. Epub 2016 Jun 16.

Abstract

Protein sequences are divided into four structural classes. Determining the class of a protein is a challenging and beneficial task in bioinformatics. Several methods have been proposed to this end, but most use too many features and produce unsatisfactory results. In the present study, features are extracted from predicted secondary structures. First, predicted secondary structure sequences are mapped into two time series by the chaos game representation. Then, a recurrence matrix is calculated from each time series. Each recurrence matrix is identified with the adjacency matrix of a complex network, and measures for characterizing complex networks are applied to these recurrence matrices. For a given protein sequence, a total of 24 characteristic features are calculated and fed into Fisher's discriminant analysis algorithm for classification. To examine the proposed method, two widely used low-similarity benchmark datasets are used to test its performance. A comparison with the results of existing methods shows that the proposed approach provides satisfactory performance for protein structural class prediction.
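
As a rough illustration of the pipeline described above, the following Python sketch maps a predicted secondary-structure string to two time series, builds a recurrence matrix from one of them, and computes a few network measures. The vertex coordinates, the starting point, the threshold `eps`, the use of raw scalar values without embedding, and the three measures chosen are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math

# Hypothetical triangle vertices for the three secondary-structure states
# (H = helix, E = strand, C = coil). The abstract does not give the mapping.
VERTICES = {
    "H": (0.0, 0.0),
    "E": (1.0, 0.0),
    "C": (0.5, math.sqrt(3) / 2),
}


def cgr_time_series(ss):
    """Chaos game: move halfway toward the vertex of each symbol in turn.

    Returns the x and y coordinates of the visited points as two time series.
    """
    x, y = 0.5, math.sqrt(3) / 6  # start at the centroid (assumption)
    xs, ys = [], []
    for symbol in ss:
        vx, vy = VERTICES[symbol]
        x, y = (x + vx) / 2, (y + vy) / 2
        xs.append(x)
        ys.append(y)
    return xs, ys


def recurrence_matrix(series, eps):
    """Binary matrix: 1 where two distinct time points lie within eps.

    The diagonal is set to 0 so the matrix can be read directly as the
    adjacency matrix of an undirected network without self-loops.
    """
    n = len(series)
    return [
        [1 if i != j and abs(series[i] - series[j]) <= eps else 0
         for j in range(n)]
        for i in range(n)
    ]


def network_features(adj):
    """A few illustrative network measures (not the paper's 24 features)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    mean_degree = sum(degrees) / n if n else 0.0
    density = sum(degrees) / (n * (n - 1)) if n > 1 else 0.0
    closed = 0   # neighbour pairs that are themselves linked
    triples = 0  # all neighbour pairs (connected triples)
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        k = len(nbrs)
        triples += k * (k - 1) // 2
        for a in range(k):
            for b in range(a + 1, k):
                closed += adj[nbrs[a]][nbrs[b]]
    transitivity = closed / triples if triples else 0.0
    return {
        "mean_degree": mean_degree,
        "density": density,
        "transitivity": transitivity,
    }


if __name__ == "__main__":
    xs, ys = cgr_time_series("HHEECCHEC")  # made-up example string
    for name, series in (("x", xs), ("y", ys)):
        adj = recurrence_matrix(series, eps=0.1)
        print(name, network_features(adj))
```

In a full workflow, the measures computed from both networks would be concatenated into one feature vector per protein and passed to a discriminant classifier; which measures make up the 24 features is not stated in the abstract.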

Keywords: Chaos game representation; Complex network; Protein structural class prediction; Recurrence matrix.

MeSH terms

  • Computational Biology / methods*
  • Databases, Protein
  • Nonlinear Dynamics
  • Protein Interaction Maps*
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteins / classification*
  • Time Factors

Substances

  • Proteins