Towards an efficient compression of 3D coordinates of macromolecular structures

PLoS One. 2017 Mar 31;12(3):e0174846. doi: 10.1371/journal.pone.0174846. eCollection 2017.

Abstract

The size and complexity of 3D macromolecular structures available in the Protein Data Bank are constantly growing. Current tools and file formats have reached the limits of their scalability. New compression approaches are required to support the visualization of large molecular complexes and to enable new, scalable means of data analysis. We evaluated a series of compression techniques for the coordinates of 3D macromolecular structures and identified the best-performing approaches. By balancing compression efficiency (decompression speed and compression ratio) against code complexity, our results provide the foundation for a novel standard for representing macromolecular coordinates in a compact and useful file format.
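The abstract does not name the techniques that were evaluated. As a purely illustrative sketch of the trade-off it describes, the Python below assumes one simple pipeline: quantize coordinates to fixed-point integers, delta-encode them, and pass the result to a general-purpose compressor. The function names, the `precision` parameter, and the use of `zlib` are assumptions made for this example and are not taken from the paper.

```python
import struct
import zlib


def delta_encode(values):
    """Return the first value followed by successive differences."""
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def compress_coords(coords, precision=1000):
    """Lossy compression of a list of coordinates (assumed to be in angstroms).

    precision=1000 keeps three decimal places (an illustrative choice).
    """
    # Quantize floats to integers so neighbouring values differ by small ints.
    quantized = [round(c * precision) for c in coords]
    deltas = delta_encode(quantized)
    # Pack as little-endian 32-bit signed integers, then apply zlib.
    raw = struct.pack(f"<{len(deltas)}i", *deltas)
    return zlib.compress(raw)


def decompress_coords(blob, precision=1000):
    """Recover coordinates to within the quantization step."""
    raw = zlib.decompress(blob)
    count = len(raw) // 4
    deltas = struct.unpack(f"<{count}i", raw)
    return [value / precision for value in delta_decode(deltas)]


if __name__ == "__main__":
    x_coords = [12.345, 12.901, 13.250, 14.001]  # made-up example values
    blob = compress_coords(x_coords)
    print(decompress_coords(blob))
```

Decoding here is a single running sum, which is cheap, while the quantization step makes the scheme lossy below the chosen precision. That is the kind of balance between decompression speed, compression ratio, and code complexity that the abstract refers to.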

MeSH terms

  • Algorithms
  • Data Compression
  • Databases, Protein*
  • Magnetic Resonance Spectroscopy
  • Models, Theoretical
  • Molecular Structure
  • Protein Structure, Secondary
  • Protein Structure, Tertiary