Bucket Fuser: Statistical Signal Extraction for 1D 1H NMR Metabolomic Data

Metabolites. 2022 Aug 29;12(9):812. doi: 10.3390/metabo12090812.

Abstract

Untargeted metabolomics is a promising tool for identifying novel disease biomarkers and unraveling underlying pathomechanisms. Nuclear magnetic resonance (NMR) spectroscopy is particularly suited for large-scale untargeted metabolomics studies due to its high reproducibility and cost effectiveness. Here, one-dimensional (1D) 1H NMR experiments offer good sensitivity at reasonable measurement times. Their subsequent data analysis requires sophisticated data preprocessing steps, including the extraction of NMR features corresponding to specific metabolites. We developed a novel 1D NMR feature extraction procedure, called Bucket Fuser (BF), which is based on a regularized regression framework with fused group LASSO terms. The performance of the BF procedure was demonstrated using three independent NMR datasets and was benchmarked against existing state-of-the-art NMR feature extraction methods. BF dynamically constructs NMR metabolite features, the widths of which can be adjusted via a regularization parameter. BF consistently improved metabolite signal extraction, as demonstrated by our correlation analyses with absolutely quantified metabolites. It also yielded a higher proportion of statistically significant metabolite features in our differential metabolite analyses. The BF algorithm is computationally efficient and it can deal with small sample sizes. In summary, the Bucket Fuser algorithm, which is available as a supplementary python code, facilitates the fast and dynamic extraction of 1D NMR signals for the improved detection of metabolic biomarkers.

Keywords: NMR metabolomics; data preprocessing; feature extraction.