THz spectral data analysis and components unmixing based on non-negative matrix factorization methods

Spectrochim Acta A Mol Biomol Spectrosc. 2017 Apr 15:177:49-57. doi: 10.1016/j.saa.2017.01.009. Epub 2017 Jan 4.

Abstract

In many situations the THz spectroscopic data observed from complex samples represent the integrated result of several interrelated variables or feature components acting together. The actual information contained in the original data might be overlapping and there is a necessity to investigate various approaches for model reduction and data unmixing. The development and use of low-rank approximate nonnegative matrix factorization (NMF) and smooth constraint NMF (CNMF) algorithms for feature components extraction and identification in the fields of terahertz time domain spectroscopy (THz-TDS) data analysis are presented. The evolution and convergence properties of NMF and CNMF methods based on sparseness, independence and smoothness constraints for the resulting nonnegative matrix factors are discussed. For general NMF, its cost function is nonconvex and the result is usually susceptible to initialization and noise corruption, and may fall into local minima and lead to unstable decomposition. To reduce these drawbacks, smoothness constraint is introduced to enhance the performance of NMF. The proposed algorithms are evaluated by several THz-TDS data decomposition experiments including a binary system and a ternary system simulating some applications such as medicine tablet inspection. Results show that CNMF is more capable of finding optimal solutions and more robust for random initialization in contrast to NMF. The investigated method is promising for THz data resolution contributing to unknown mixture identification.

Keywords: Constraint NMF (CNMF); Nonnegative matrix factorization (NMF); Terahertz time domain spectroscopy (THz-TDS); Unmixing.