A novel storage method for near infrared spectroscopy chemometric models

Anal Chim Acta. 2010 Jun 4;668(2):149-54. doi: 10.1016/j.aca.2010.04.032. Epub 2010 Apr 24.

Abstract

We developed the Chemometric Modeling Markup Language (CMML) to store a complete chemometric model in a single document and so address the interoperability problem in sharing chemometric models. Binary data are converted into strings with base64 encoding and decoding. CMML provides basic functionality for storing sampling, variable selection, pretreatment, outlier, and modeling parameters and data. Encoding binary data as base64 strings balances the usability of CMML against file size. Because CMML is built on Extensible Markup Language (XML), models stored in it can be readily reused in other software and programming languages, provided the language has an XML parsing library. The XML Path Language (XPath) can also be used to select the desired data from a CMML file efficiently. The application of CMML to near infrared spectroscopy model storage is implemented as a C++ class and is available as open source software (http://code.google.com/p/cmml); implementations in other languages, such as MATLAB and R, are in progress.
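
The storage idea described above (binary arrays turned into base64 strings, embedded in XML, then selected with an XPath expression) can be illustrated with a minimal Python sketch. It is not the CMML schema or the authors' C++ class: the element names (`model`, `pretreatment`, `coefficients`), the attribute names, the helper functions, and the choice of little-endian 64-bit floats are all assumptions made for illustration only.

```python
import base64
import struct
import xml.etree.ElementTree as ET


def encode_doubles(values):
    """Pack floats as little-endian 64-bit doubles and return a base64 string."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_doubles(text):
    """Inverse of encode_doubles: base64 string back to a list of floats."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def build_model_document(coefficients, pretreatment):
    """Serialize a toy model to an XML string.

    The tag and attribute names are hypothetical, not taken from CMML.
    """
    root = ET.Element("model")
    ET.SubElement(root, "pretreatment", {"method": pretreatment})
    coef = ET.SubElement(
        root, "coefficients", {"encoding": "base64", "dtype": "float64"}
    )
    coef.text = encode_doubles(coefficients)
    return ET.tostring(root, encoding="unicode")


def read_coefficients(xml_text):
    """Select the encoded array with an XPath expression and decode it."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a subset of XPath, including attribute predicates.
    node = root.find("./coefficients[@encoding='base64']")
    return decode_doubles(node.text)


if __name__ == "__main__":
    document = build_model_document([0.5, -1.25, 3.0], "SNV")
    print(document)
    print(read_coefficients(document))
```

Because the binary payload travels as ordinary text, any language with an XML parser and a base64 decoder could read such a file. The cost is size: base64 output is roughly one third larger than the raw bytes.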

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Information Storage and Retrieval* / methods
  • Models, Chemical*
  • Programming Languages*
  • Spectroscopy, Near-Infrared* / methods