Inter-domain linker prediction using amino acid compositional index

Comput Biol Chem. 2015 Apr:55:23-30. doi: 10.1016/j.compbiolchem.2015.01.006. Epub 2015 Jan 24.

Abstract

Protein chains are generally long and consist of multiple domains. Domains are distinct structural units of a protein that can evolve and function independently. The accurate and reliable prediction of protein domain linkers and boundaries is often considered the initial step in predicting protein tertiary structure and function. In this paper, we introduce CISA, a method for predicting inter-domain linker regions solely from amino acid sequence information. The method first computes the amino acid compositional index from the protein sequence dataset of domain-linker segments and the amino acid composition. A preference profile is then generated by averaging the compositional index values along the amino acid sequence using a sliding window. Finally, the protein sequence is segmented into intervals, and a simulated annealing algorithm is employed to enhance the prediction by finding, for each segment, the optimal threshold value that separates domains from inter-domain linkers. The method was tested on two standard protein datasets and showed considerable improvement over state-of-the-art domain linker prediction methods.
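
The three steps described above can be illustrated with a minimal Python sketch. It is not the CISA implementation: the abstract does not give the formula for the compositional index, so a simple negative log-ratio of linker to domain residue frequencies is assumed here, and the window size, pseudocount, accuracy-based objective, and cooling schedule are likewise illustrative assumptions. All function names are hypothetical.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def compositional_index(linker_segments, domain_segments, pseudocount=1.0):
    """Per-residue index; ASSUMED form: -ln(f_linker / f_domain).
    Negative values mean the residue is more frequent in linkers."""
    linker_counts = Counter("".join(linker_segments))
    domain_counts = Counter("".join(domain_segments))
    n = len(AMINO_ACIDS)
    linker_total = sum(linker_counts[a] for a in AMINO_ACIDS) + pseudocount * n
    domain_total = sum(domain_counts[a] for a in AMINO_ACIDS) + pseudocount * n
    index = {}
    for a in AMINO_ACIDS:
        f_linker = (linker_counts[a] + pseudocount) / linker_total
        f_domain = (domain_counts[a] + pseudocount) / domain_total
        index[a] = -math.log(f_linker / f_domain)
    return index

def preference_profile(sequence, index, window=11):
    """Average the index over a sliding window centred on each residue.
    Windows are truncated at the sequence ends."""
    half = window // 2
    values = [index.get(a, 0.0) for a in sequence]
    profile = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile

def anneal_threshold(profile, is_linker, steps=500, temp=1.0, cooling=0.99, seed=0):
    """Simulated annealing over one threshold for one segment.
    ASSUMED objective: number of residues classified correctly,
    where a residue is called a linker if its profile value < threshold."""
    rng = random.Random(seed)

    def score(threshold):
        return sum((v < threshold) == label for v, label in zip(profile, is_linker))

    lo, hi = min(profile), max(profile)
    span = (hi - lo) or 1.0
    current = best = (lo + hi) / 2
    current_score = best_score = score(current)
    for _ in range(steps):
        # Propose a nearby threshold, clamped to the profile's range.
        candidate = min(hi, max(lo, current + rng.uniform(-0.1, 0.1) * span))
        delta = score(candidate) - current_score
        # Accept improvements always, and worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= cooling
    return best

# Toy usage with made-up training segments and a made-up sequence.
index = compositional_index(linker_segments=["PPPP"], domain_segments=["LLLL"])
sequence = "LLLLPPPPLLLL"
profile = preference_profile(sequence, index, window=3)
labels = [residue == "P" for residue in sequence]
threshold = anneal_threshold(profile, labels)
predicted = [value < threshold for value in profile]
```

In this sketch a single threshold is tuned for the whole toy sequence; applying `anneal_threshold` separately to each interval of a segmented sequence would mirror the per-segment thresholds the abstract describes.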

Keywords: Amino acid composition; Compositional index; Domain linker prediction; Simulated annealing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence*
  • Computer Simulation
  • Machine Learning
  • Models, Molecular
  • Protein Conformation
  • Protein Structure, Tertiary
  • Software*