Normalizing Gas-Chromatography-Mass Spectrometry Data: Method Choice can Alter Biological Inference

Michael J Noonan; Helga V Tinnesand; Christina D Buesching

doi:10.1002/bies.201700210

Bioessays. 2018 Jun;40(6):e1700210. doi: 10.1002/bies.201700210. Epub 2018 Apr 30.

Authors

Michael J Noonan¹, Helga V Tinnesand², Christina D Buesching³

Affiliations

¹ Smithsonian Conservation Biology Institute, National Zoological Park, 1500 Remount Rd., Front Royal, VA 22630, USA.
² Faculty of Technology, Natural Sciences, and Maritime Sciences, Department of Natural Sciences and Environmental Health, University College of Southeast Norway, 3800 Bø i Telemark, Norway.
³ Wildlife Conservation Research Unit, Zoology Department, The Recanati-Kaplan Centre, University of Oxford, Tubney House, Abingdon Road, Tubney, Abingdon, OX13 5QL, UK.

PMID: 29709068

Abstract

We demonstrate how different normalization techniques in GC-MS analysis impart unique properties to the data, and thereby influence any biological inference drawn from them. Using simulations and empirical data, we compare the most commonly used techniques (Total Sum Normalization 'TSN'; Median Normalization 'MN'; Probabilistic Quotient Normalization 'PQN'; Internal Standard Normalization 'ISN'; External Standard Normalization 'ESN'; and a compositional data approach 'CODA'). When differences between biological classes are pronounced, ESN and ISN provide good results, but they are less reliable for more subtly differentiated groups. MN, TSN, and CODA approaches produce variable results that depend on the structure of the data, and are prone to false positive biomarker identification. In contrast, PQN exhibits the lowest false positive rate, though with occasionally poor model performance. Because ESN requires extensive pre-planning and offers only mixed reliability, and because ISN, TSN, MN, and CODA approaches are prone to introducing artefactual differences, we recommend the use of PQN in GC-MS research.
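
To make three of the techniques named above concrete, the following is a minimal sketch of TSN, MN, and PQN, not the authors' implementation. It assumes a samples-by-compounds matrix of strictly positive peak areas, uses the common textbook form of PQN with the per-compound median across samples as the reference profile, and omits the optional pre-normalization step that PQN is sometimes paired with. The function names and the toy data are hypothetical.

```python
import numpy as np

def total_sum_normalize(X):
    """TSN: divide each sample (row) by the sum of its peak areas."""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=1, keepdims=True)

def median_normalize(X):
    """MN: divide each sample (row) by the median of its peak areas."""
    X = np.asarray(X, dtype=float)
    return X / np.median(X, axis=1, keepdims=True)

def pqn_normalize(X):
    """PQN: divide each sample by the median of its quotients
    against a reference profile (assumed here: per-compound median)."""
    X = np.asarray(X, dtype=float)
    reference = np.median(X, axis=0)           # one value per compound
    quotients = X / reference                  # assumes no zero reference
    dilution = np.median(quotients, axis=1, keepdims=True)
    return X / dilution

# Hypothetical toy data: two samples that differ only in compound 0.
X = np.array([[10.0, 20.0, 30.0, 40.0],
              [50.0, 20.0, 30.0, 40.0]])

tsn = total_sum_normalize(X)
mn = median_normalize(X)
pqn = pqn_normalize(X)

# Compounds 1-3 are identical in the raw data; compare them after scaling.
print("TSN keeps compounds 1-3 equal:", np.allclose(tsn[0, 1:], tsn[1, 1:]))
print("PQN keeps compounds 1-3 equal:", np.allclose(pqn[0, 1:], pqn[1, 1:]))
```

In this toy case, TSN rescales the second sample by its larger total, so the three unchanged compounds appear to differ between samples, whereas PQN estimates a scaling factor of 1 for both samples and leaves them equal. This is only an illustration of the kind of artefactual difference the abstract describes; it says nothing about how the methods rank on real data.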

Keywords: GC-MS; biomarker identification; log-ratio transformations; olfactory communication; pheromones; pre-processing; size effects.

Publication types

Research Support, Non-U.S. Gov't
Review

MeSH terms

Animals
Biomarkers / chemistry
Gas Chromatography-Mass Spectrometry / methods*

Substances

Biomarkers