Development of data representation standards by the Human Proteome Organization Proteomics Standards Initiative

J Am Med Inform Assoc. 2015 May;22(3):495-506. doi: 10.1093/jamia/ocv001. Epub 2015 Feb 28.

Abstract

Objective: To describe the goals of the Proteomics Standards Initiative (PSI) of the Human Proteome Organization, the methods that the PSI has employed to create data standards, the resulting output of the PSI, lessons learned from the PSI's evolution, and future directions and synergies for the group.

Materials and methods: The PSI has 5 categories of deliverables that have guided the group. These are minimum information guidelines, data formats, controlled vocabularies, resources and software tools, and dissemination activities. These deliverables are produced via the leadership and working group organization of the initiative, driven by frequent workshops and ongoing communication within the working groups. Official standards are subjected to a rigorous document process that includes several levels of peer review prior to release.

Results: We have produced and published minimum information guidelines describing what information should be provided when making data public, either via public repositories or other means. The PSI has produced a series of standard formats covering mass spectrometer input, mass spectrometer output, results of informatics analysis (both qualitative and quantitative analyses), reports of molecular interaction data, and gel electrophoresis analyses. We have produced controlled vocabularies that ensure that concepts are uniformly annotated in the formats, and we have engaged in extensive software development and dissemination efforts so that the standards can be used efficiently by the community.

Conclusion: In its first dozen years of operation, the PSI has produced many standards that have accelerated the field of proteomics by facilitating data exchange and deposition to data repositories. We look to the future to continue developing standards for new proteomics technologies and workflows and mechanisms for integration with other omics data types. Our products facilitate the translation of genomics and proteomics findings to clinical and biological phenotypes. The PSI website can be accessed at http://www.psidev.info.

Keywords: HUPO; data formats; data standards; guidelines; proteomics; proteomics standards initiative; standards; standards organization.

Publication types

  • Research Support, American Recovery and Reinvestment Act
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Databases as Topic / standards*
  • Guidelines as Topic
  • Humans
  • Mass Spectrometry / standards
  • Proteome*
  • Proteomics / standards*
  • Societies, Medical
  • Vocabulary, Controlled

Substances

  • Proteome