Mass spectrometer output file format mzML

Methods Mol Biol. 2010:604:319-31. doi: 10.1007/978-1-60761-444-9_22.

Abstract

Mass spectrometry is an important technique for analyzing proteins and other biomolecular compounds in biological samples. Each mass spectrometer vendor uses a different proprietary binary output file format, which has hindered data sharing and the development of open-source software for downstream analysis. The solution has been to develop, with the full participation of academic researchers as well as software and hardware vendors, an open XML-based format for encoding mass spectrometer output files, and then to write software that uses this format for archiving, sharing, and processing. This chapter presents the various components and information available for this format, mzML. In addition to the XML schema that defines the file structure, a controlled vocabulary provides clear terms and definitions for the spectral metadata, and a semantic validation rules mapping file allows the mzML semantic validator to ensure that an mzML document complies with one of several levels of requirements. Complete documentation and example files ensure that the format may be uniformly implemented. At the time of release, several implementations of the format already existed, and vendors have committed to supporting the format in their products.
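
To make the pairing of XML structure and controlled-vocabulary metadata concrete, the sketch below reads a small hand-written fragment in the style of mzML and collects the vocabulary terms attached to each spectrum. It is an illustration only: the fragment is heavily simplified and not schema-valid, the function name spectrum_terms is hypothetical, and the namespace URI and the accession MS:1000511 ("ms level") are assumptions that should be checked against the published schema and controlled vocabulary.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily simplified fragment in the style of mzML.
# It is NOT a complete or schema-valid document; the namespace URI and
# the accession/name pair are assumptions made for illustration.
FRAGMENT = """\
<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="example_run">
    <spectrumList count="2">
      <spectrum index="0" id="scan=1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum index="1" id="scan=2" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>
"""

# Prefix-to-URI map used by ElementTree to resolve the "mz:" prefix below.
NS = {"mz": "http://psi.hupo.org/ms/mzml"}


def spectrum_terms(xml_text):
    """Map each spectrum id to {accession: (name, value)} for its cvParam children."""
    root = ET.fromstring(xml_text)
    result = {}
    for spectrum in root.iterfind(".//mz:spectrum", NS):
        terms = {}
        # Only direct cvParam children of the spectrum are collected here.
        for param in spectrum.iterfind("mz:cvParam", NS):
            terms[param.get("accession")] = (param.get("name"), param.get("value"))
        result[spectrum.get("id")] = terms
    return result


terms_by_spectrum = spectrum_terms(FRAGMENT)
for spectrum_id, terms in terms_by_spectrum.items():
    print(spectrum_id, terms)
```

Because each metadata item carries an accession rather than free text, a reader can look up the term by its identifier instead of guessing at vendor-specific labels; a semantic validator can apply the same idea to check which terms are allowed in which elements.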

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Databases, Protein
  • Mass Spectrometry / methods
  • Mass Spectrometry / standards*
  • Proteomics / methods
  • Proteomics / standards*
  • Software*
  • Vocabulary, Controlled