Convolutional neural networks for vibrational spectroscopic data analysis

Anal Chim Acta. 2017 Feb 15:954:22-31. doi: 10.1016/j.aca.2016.12.010. Epub 2016 Dec 27.

Abstract

In this work we show that convolutional neural networks (CNNs) can be efficiently used to classify vibrational spectroscopic data and identify important spectral regions. CNNs are the current state-of-the-art in image classification and speech recognition and can learn interpretable representations of the data. These characteristics make CNNs a good candidate for reducing the need for preprocessing and for highlighting important spectral regions, both of which are crucial steps in the analysis of vibrational spectroscopic data. Chemometric analysis of vibrational spectroscopic data often relies on preprocessing methods involving baseline correction, scatter correction and noise removal, which are applied to the spectra prior to model building. Preprocessing is a critical step because even in simple problems using 'reasonable' preprocessing methods may decrease the performance of the final model. We develop a new CNN based method and provide an accompanying publicly available software. It is based on a simple CNN architecture with a single convolutional layer (a so-called shallow CNN). Our method outperforms standard classification algorithms used in chemometrics (e.g. PLS) in terms of accuracy when applied to non-preprocessed test data (86% average accuracy compared to the 62% achieved by PLS), and it achieves better performance even on preprocessed test data (96% average accuracy compared to the 89% achieved by PLS). For interpretability purposes, our method includes a procedure for finding important spectral regions, thereby facilitating qualitative interpretation of results.

Keywords: Convolutional neural networks; Preprocessing; Vibrational spectroscopy.