CONNJUR R: an annotation strategy for fostering reproducibility in bio-NMR-protein spectral assignment

J Biomol NMR. 2015 Oct;63(2):141-50. doi: 10.1007/s10858-015-9964-1. Epub 2015 Aug 8.

Abstract

Reproducibility is a cornerstone of the scientific method, essential for validation of results by independent laboratories and the sine qua non of scientific progress. A key step toward reproducibility of biomolecular NMR studies was the establishment of public data repositories (PDB and BMRB). Nevertheless, bio-NMR studies routinely fall short of the requirement for reproducibility that all the data needed to reproduce the results are published. A key limitation is that considerable metadata goes unpublished, notably manual interventions that are typically applied during the assignment of multidimensional NMR spectra. A general solution to this problem has been elusive, in part because of the wide range of approaches and software packages employed in the analysis of protein NMR spectra. Here we describe an approach for capturing missing metadata during the assignment of protein NMR spectra that can be generalized to arbitrary workflows, different software packages, other biomolecules, or other stages of data analysis in bio-NMR. We also present extensions to the NMR-STAR data dictionary that enable machine archival and retrieval of the "missing" metadata.

Keywords: Analysis; CONNJUR; Data model; NMR-STAR; Reproducibility.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods
  • Databases, Protein
  • Humans
  • Nuclear Magnetic Resonance, Biomolecular* / methods
  • Proteins / chemistry*
  • Reproducibility of Results

Substances

  • Proteins