Information quality in proteomics

Brief Bioinform. 2008 Mar;9(2):174-88. doi: 10.1093/bib/bbn004.

Abstract

Proteomics, the study of the protein complement of a biological system, is generating increasing quantities of data from rapidly developing technologies employed in a variety of experimental workflows. Experimental processes, e.g. comparative 2D gel studies or LC-MS/MS analyses of complex protein mixtures, involve a number of steps: from experimental design, through wet- and dry-lab operations, to publication of data in repositories, and finally to data annotation and maintenance. Inaccuracies introduced throughout this processing pipeline, however, can render the resulting data untrustworthy, offsetting the benefits of high-throughput technology. While researchers and practitioners are generally aware of some of the information quality issues associated with public proteomics data, there are few accepted criteria or guidelines for dealing with them. In this article, we highlight factors that affect the quality of experimental data and review current approaches to information quality management in proteomics. Data quality issues are considered throughout the lifecycle of a proteomics experiment, from experiment design and technique selection, through data analysis, to archiving and sharing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Database Management Systems
  • Electrophoresis, Gel, Two-Dimensional
  • Information Storage and Retrieval* / methods
  • Information Storage and Retrieval* / standards
  • Mass Spectrometry
  • Proteins / analysis
  • Proteomics* / instrumentation
  • Proteomics* / methods
  • Proteomics* / standards
  • Quality Control*
  • Software

Substances

  • Proteins