Algorithm for comprehensive analysis of datasets from hyphenated high resolution mass spectrometric techniques using single ion profiles and cluster analysis

J Chromatogr A. 2016 Jan 15;1429:134-41. doi: 10.1016/j.chroma.2015.12.005. Epub 2015 Dec 11.

Abstract

Various algorithms have been developed to improve the quantity and quality of information that can be extracted from complex datasets obtained using hyphenated mass spectrometric techniques. While different approaches are possible, the key step often consists of arranging the data into a large series of profiles known as extracted ion profiles. Those profiles, similar to mono-dimensional separation profiles, are then processed to detect potential chromatographic peaks. This makes it possible to extract from the dataset a large number of peaks that are characteristic of the compounds that have been separated. However, with mass spectrometry (MS) detection, the response is usually a complex signal whose pattern depends on the analyte, the MS instrument and the ionization method. When converted to ionic profiles, a single separated analyte will have multiple images at different m/z ranges. In this manuscript we present a hierarchical agglomerative clustering algorithm to group profiles with very similar features. Each group aims to contain all profiles that are due to the transport and monitoring of a single analyte. Clustering results are then used to generate a two-dimensional representation, called a clusters plot, which allows an in-depth analysis of the MS dataset, including the visualization of poorly separated compounds even when their intensities differ by more than two orders of magnitude. The usefulness of this new approach has been validated with data from capillary electrophoresis coupled to time-of-flight mass spectrometry via electrospray ionization. Using a mixture of 17 low-molecular-weight endogenous compounds, it was verified that the ionic profiles belonging to each compound were correctly clustered even with a very low degree of separation (resolution R below 0.03). The approach was also validated using a urine sample. While only 15 peaks could be distinguished with the total ion profile, 70 clusters were obtained, allowing a much more thorough analysis. In this particular example, the total computing time was less than 10 min.
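
The grouping step described above can be illustrated with a minimal sketch. The abstract does not state which similarity measure or linkage rule the authors use, so the choices here are assumptions made purely for illustration: profiles are compared with a Pearson correlation distance (1 - r), clusters are merged by average linkage, and merging stops at a hypothetical threshold max_distance. The synthetic Gaussian profiles and all function names are likewise invented for the example and are not taken from the paper.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long intensity profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / math.sqrt(var_a * var_b)


def cluster_profiles(profiles, max_distance=0.1):
    """Average-linkage agglomerative clustering on the distance 1 - r.

    max_distance is an assumed stopping threshold, not a value from the paper.
    Returns a list of clusters, each a sorted list of profile indices.
    """
    n = len(profiles)
    dist = [[1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per profile
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > max_distance:
            break  # the closest remaining pair is too dissimilar to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


def gaussian(center, width, height, n_points=200):
    """Synthetic peak standing in for one extracted ion profile."""
    return [height * math.exp(-0.5 * ((t - center) / width) ** 2)
            for t in range(n_points)]


# Two hypothetical analytes; each gives several ion profiles with the same
# shape but intensities that differ by up to two orders of magnitude.
profiles = [
    gaussian(80, 5, 1000.0),
    gaussian(80, 5, 10.0),
    gaussian(80, 5, 250.0),
    gaussian(120, 5, 500.0),
    gaussian(120, 5, 5.0),
]
clusters = cluster_profiles(profiles)
print(clusters)
```

Because correlation ignores absolute scale, profiles of the same analyte end up together regardless of intensity, which is the property the abstract highlights. Real data would also need noise handling and peak detection before clustering, and those steps are not shown here.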

Keywords: Chemometrics; Data mining; Hyphenated techniques; Representation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Chemistry Techniques, Analytical / methods*
  • Cluster Analysis*
  • Electrophoresis, Capillary
  • Ions / chemistry*
  • Mass Spectrometry*

Substances

  • Ions