Bayesian approach to peak deconvolution and library search for high resolution gas chromatography - Mass spectrometry

Anal Chim Acta. 2017 Aug 29:983:76-90. doi: 10.1016/j.aca.2017.06.044. Epub 2017 Jun 30.

Abstract

A novel probabilistic Bayesian strategy is proposed to resolve highly coeluting peaks in high-resolution GC-MS (Orbitrap) data. In contrast to deterministic approaches, the problem is solved probabilistically through a complete pipeline. First, the retention time(s) of a (probabilistic) number of compounds are estimated for each mass channel. The statistical dependency between m/z channels is introduced by including penalties in the model objective function. Second, the Bayesian Information Criterion (BIC) is used as Occam's razor for the probabilistic assessment of the number of components. Third, a probabilistic set of resolved spectra and their associated retention times is estimated. Finally, a probabilistic library search is proposed, computing the spectral match against a high-resolution library. More specifically, a correlative measure is used that includes the uncertainties in the least-squares fitting as well as the probabilities of the different proposals for the number of compounds in the mixture. The method was tested on simulated high-resolution data and on a set of pesticides injected into a GC-Orbitrap with high coelution. The proposed pipeline accurately detected the retention times and the spectra of the peaks. In our case, an extremely high coelution situation, 5 of the 7 compounds present in the selected region of interest were correctly assessed. Finally, a comparison with classical deconvolution methods (i.e., MCR and AMDIS) indicates better performance of the proposed algorithm in terms of the number of correctly resolved compounds.
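The use of BIC as Occam's razor for the number of components can be illustrated with a minimal sketch on a single simulated mass channel. This is not the pipeline of the paper: the Gaussian peak shape, the known peak width, the grid of candidate retention times, and all function names below are assumptions made for illustration, and the penalties coupling m/z channels are omitted. The sketch relies on the standard approximation that, under equal prior odds, the probability of a model with K peaks is proportional to exp(-BIC_K / 2).

```python
import itertools
import numpy as np

def gaussian_peaks(t, centers, width):
    """Design matrix with one unit-height Gaussian column per retention time."""
    centers = np.asarray(centers, dtype=float)
    return np.exp(-0.5 * ((t[:, None] - centers[None, :]) / width) ** 2)

def best_fit_rss(t, y, k, grid, width):
    """Smallest residual sum of squares for k peaks with centers taken from `grid`."""
    best = np.inf
    for centers in itertools.combinations(grid, k):
        design = gaussian_peaks(t, centers, width)
        # Peak heights are linear parameters, so ordinary least squares suffices.
        heights, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ heights) ** 2))
        best = min(best, rss)
    return best

def bic_model_probabilities(t, y, max_k, grid, width):
    """BIC for 1..max_k peaks and the approximate model probabilities derived from it."""
    n = t.size
    bic = []
    for k in range(1, max_k + 1):
        rss = best_fit_rss(t, y, k, grid, width)
        n_params = 2 * k  # one retention time and one height per peak
        bic.append(n * np.log(rss / n) + n_params * np.log(n))
    bic = np.array(bic)
    # exp(-BIC/2) approximates the model evidence; subtract the minimum for stability.
    weights = np.exp(-0.5 * (bic - bic.min()))
    return bic, weights / weights.sum()

# Simulated channel (assumed values): two coeluting peaks plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
signal = gaussian_peaks(t, [4.5, 5.5], 0.5) @ np.array([1.0, 0.7])
y = signal + rng.normal(scale=0.02, size=t.size)

grid = np.linspace(3.0, 7.0, 9)  # candidate retention times, step 0.5
bic, prob = bic_model_probabilities(t, y, max_k=3, grid=grid, width=0.5)
best_k = int(np.argmax(prob)) + 1
print("BIC per K:", bic)
print("P(K | data):", prob)
print("most probable number of peaks:", best_k)
```

Keeping the whole vector `prob` rather than only `best_k` mirrors the idea in the abstract that later steps, such as the library search, can be weighted by the probability of each proposed number of compounds.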

Keywords: Bayesian statistics; Compound identification; Deconvolution; GC-Orbitrap data; High resolution mass spectrometry.