Detecting intermediate protein conformations using algebraic topology

BMC Bioinformatics. 2017 Dec 6;18(Suppl 15):502. doi: 10.1186/s12859-017-1918-z.

Abstract

Background: Understanding protein structure and dynamics is essential for understanding protein function. This is a challenging task because the conformational landscapes of proteins are highly complex and their energy landscapes are rugged. In particular, it is important to detect highly populated regions, which could correspond to intermediate structures or local minima.

Results: We present a method based on hierarchical clustering and algebraic topology that detects regions of interest in protein conformational space. Its main tools are coarse-grained protein conformational search, efficient and robust dimensionality reduction, and topological analysis via persistent homology. We use two dimensionality reduction methods, robust Principal Component Analysis (PCA) and Isomap, to generate a reduced representation of the data while preserving most of its variance.
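
As an illustration of the persistent homology step named in the Results, the following minimal sketch computes 0-dimensional persistence (connected components) of a point cloud with a union-find structure. It is not the authors' implementation. The function names `h0_persistence` and `count_clusters` and the toy 2D points are assumptions made for this example, and the conformational search and dimensionality reduction steps are omitted. Each point is treated as a component born at scale 0, and a component "dies" at the distance at which it merges into another; long-lived components suggest well-separated populated regions.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the sorted death scales of 0-dimensional classes.

    Every point starts as its own component (born at scale 0). Edges are
    added in order of increasing Euclidean distance; each time an edge joins
    two different components, one of them dies at that distance. The last
    surviving component never dies, so n points give n - 1 death values.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, sorted ascending.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj   # merge the two components
            deaths.append(d)  # one component dies at scale d
    return deaths


def count_clusters(deaths, scale):
    """Number of connected components still alive at the given scale."""
    return 1 + sum(1 for d in deaths if d > scale)


# Hypothetical toy data: two tight groups of three points each.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
       (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
deaths = h0_persistence(toy)
print(deaths)
print(count_clusters(deaths, 2.0))
```

In this toy case, four components die at a small scale and one survives much longer, which reflects the two separated groups. The merge order is the same as in single-linkage hierarchical clustering, which is why the two techniques are often paired.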

Conclusions: Our hierarchical clustering method produced compact, well-separated clusters for all tested examples.

Keywords: Algebraic topology; Clustering; Dimensionality reduction; Protein conformational sampling; Protein structure.

MeSH terms

  • Cluster Analysis
  • Computational Biology / methods*
  • Principal Component Analysis
  • Protein Conformation*
  • Proteins* / chemistry
  • Proteins* / metabolism

Substances

  • Proteins