On Representing Protein Folding Patterns Using Non-Linear Parametric Curves

IEEE/ACM Trans Comput Biol Bioinform. 2014 Nov-Dec;11(6):1218-28. doi: 10.1109/TCBB.2014.2338319.

Abstract

Proteins fold into complex three-dimensional shapes. Simplified representations of their shapes are central to rationalising, comparing, classifying, and interpreting protein structures. Traditional methods to abstract protein folding patterns rely on representing their standard secondary structural elements (helices and strands of sheet) using line segments. This discards a significant proportion of structural information. The motivation of this research is to derive mathematically rigorous and biologically meaningful abstractions of protein folding patterns that maximize the economy of structural description and minimize the loss of structural information. We report on a novel method to describe a protein as a non-overlapping set of parametric three-dimensional curves of varying length and complexity. Our approach to this problem is supported by information theory and uses the statistical framework of minimum message length (MML) inference. We demonstrate the effectiveness of our non-linear abstraction to support efficient and effective comparison of protein folding patterns on a large scale.
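
The idea of trading description economy against information loss can be illustrated with a toy sketch. The code below is not the paper's MML formulation: it assumes a chain given as a list of 3D points, fits each candidate run with a quadratic Bezier curve whose ends are pinned to the run's first and last points, scores the run with a crude two-part cost in bits (a flat model cost plus a Gaussian cost for the deviations), and picks the cheapest non-overlapping segmentation by dynamic programming. The names `fit_residual`, `segment_bits`, `segment_chain` and the constants `MODEL_BITS`, `SIGMA`, `EPS` are hypothetical and chosen only for illustration.

```python
import math

# All three constants are illustrative assumptions, not values from the paper.
MODEL_BITS = 10.0  # flat cost (bits) of stating one curve segment
SIGMA = 0.2        # assumed standard deviation of coordinate deviations
EPS = 0.01         # assumed precision to which coordinates are stated


def fit_residual(points):
    """Sum of squared deviations from a least-squares quadratic Bezier
    whose end control points are pinned to the first and last point.
    Points are assigned uniformly spaced curve parameters t = k / (n - 1)."""
    n = len(points)
    p0, p2 = points[0], points[-1]
    num = [0.0, 0.0, 0.0]
    den = 0.0
    rows = []
    for k, q in enumerate(points):
        t = k / (n - 1)
        a = 2.0 * t * (1.0 - t)  # weight of the free middle control point
        r = [q[d] - (1.0 - t) ** 2 * p0[d] - t ** 2 * p2[d] for d in range(3)]
        rows.append((a, r))
        den += a * a
        for d in range(3):
            num[d] += a * r[d]
    # Closed-form least-squares estimate of the middle control point.
    p1 = [num[d] / den if den > 0 else 0.0 for d in range(3)]
    return sum((r[d] - a * p1[d]) ** 2 for a, r in rows for d in range(3))


def segment_bits(points):
    """Crude two-part cost in bits: model part plus Gaussian data part.
    The first point of a run is treated as already stated by its predecessor."""
    n_coded = len(points) - 1
    const = 3 * n_coded * math.log2(SIGMA * math.sqrt(2 * math.pi) / EPS)
    misfit = fit_residual(points) / (2 * SIGMA ** 2) * math.log2(math.e)
    return MODEL_BITS + const + misfit


def segment_chain(points):
    """Return cut indices of the cheapest segmentation (adjacent runs share
    only their joint point), found by dynamic programming over end indices."""
    n = len(points)
    best = [0.0] + [math.inf] * (n - 1)
    back = [0] * n
    for j in range(1, n):
        for i in range(j):
            cost = best[i] + segment_bits(points[i:j + 1])
            if cost < best[j]:
                best[j], back[j] = cost, i
    cuts = [n - 1]
    while cuts[-1] != 0:
        cuts.append(back[cuts[-1]])
    return cuts[::-1]
```

Calling `segment_chain(coords)` on a list of `(x, y, z)` tuples returns the indices at which one curve ends and the next begins. Raising `MODEL_BITS` favours fewer, longer curves, while lowering `SIGMA` penalises misfit more heavily and favours more, shorter ones.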

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Humans
  • Models, Molecular
  • Nonlinear Dynamics
  • Protein Folding*
  • Proteins / chemistry*
  • Proteins / metabolism*

Substances

  • Proteins