Extraction and cleansing of data for a non-targeted analysis of high-resolution mass spectrometry data of wastewater

MethodsX. 2018 Apr 17:5:395-402. doi: 10.1016/j.mex.2018.04.008. eCollection 2018.

Abstract

We provide a workflow to extract unidentified signals from liquid chromatography-high resolution mass spectrometry (LC-HRMS) data of wastewater samples as a preliminary step in a non-targeted analysis of dissolved organic matter (DOM). We describe the data processing and cleanup methodology in detail, using the MS processing software MZmine 2 and our own set of R functions developed for wastewater analysis. The processing involves signal extraction, linear mass correction, noise reduction, grouping of isotopologues, molecular formula assignment, and merging of replicates. The article contains the software settings and the reasoning behind the choice of data extraction options. The supplementary information contains a script for correcting signal masses using internal standards, along with templates of internal standard lists. We include a reproducible example as an R notebook containing the data cleansing workflow and data exported from MZmine. The data were used, following the described methodology, in the article "A non-targeted high-resolution mass spectrometry data analysis of dissolved organic matter in wastewater treatment" by Verkh et al., 2018.

• Includes a linear mass correction algorithm for LC-HRMS signals.
• Describes a pipeline for non-targeted processing of LC-HRMS wastewater data using free software.
• Provides tests and reasons for the choice of parameters in non-targeted LC-HRMS wastewater data extraction.
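The idea of a linear mass correction with internal standards can be illustrated with a minimal sketch. It is not the authors' R script from the supplementary information: it assumes that the mass error (observed minus theoretical m/z) of the internal standards is fitted by ordinary least squares as a linear function of observed m/z, and that the fitted error is then subtracted from every signal. The function names and the numeric values are hypothetical and synthetic, chosen only for illustration.

```python
# Illustrative sketch only; not the published script.
# Assumption: mass error = slope * observed_mz + intercept.


def fit_linear_mass_error(observed_mz, theoretical_mz):
    """Least-squares fit of (observed - theoretical) against observed m/z.

    Returns (slope, intercept) of the fitted error line.
    """
    n = len(observed_mz)
    if n != len(theoretical_mz):
        raise ValueError("observed and theoretical lists differ in length")
    if n < 2:
        raise ValueError("at least two internal standards are required")

    errors = [o - t for o, t in zip(observed_mz, theoretical_mz)]
    mean_x = sum(observed_mz) / n
    mean_y = sum(errors) / n
    sxx = sum((x - mean_x) ** 2 for x in observed_mz)
    if sxx == 0:
        raise ValueError("internal standards must have distinct m/z values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed_mz, errors))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def correct_masses(mz_values, slope, intercept):
    """Subtract the fitted mass error from each observed m/z value."""
    return [mz - (slope * mz + intercept) for mz in mz_values]


if __name__ == "__main__":
    # Synthetic internal standards (hypothetical m/z values, not real compounds).
    theoretical = [150.0, 300.0, 450.0, 600.0]
    # Simulate a drift that is linear in the observed m/z.
    true_slope, true_intercept = 2e-6, 0.0005
    observed = [(t + true_intercept) / (1 - true_slope) for t in theoretical]

    slope, intercept = fit_linear_mass_error(observed, theoretical)
    print(correct_masses(observed, slope, intercept))
```

In practice the fit would be made per sample or per batch from the internal standard list, and the residual error after correction would be checked before molecular formula assignment.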

Keywords: Combined MZmine 2.26 and R extraction workflow of LC-HRMS wastewater data; MZmine; Molecular formula prediction; R; Water screening.