Deep Learning-Based Speech Enhancement of an Extrinsic Fabry-Perot Interferometric Fiber Acoustic Sensor System

Sensors (Basel). 2023 Mar 29;23(7):3574. doi: 10.3390/s23073574.

Abstract

To achieve high-quality voice communication technology without noise interference in flammable, explosive and strong electromagnetic environments, the speech enhancement technology of a fiber-optic external Fabry-Perot interferometric (EFPI) acoustic sensor based on deep learning is studied in this paper. The combination of a complex-valued convolutional neural network and a long short-term memory (CV-CNN-LSTM) model is proposed for speech enhancement in the EFPI acoustic sensing system. Moreover, the 3 × 3 coupler algorithm is used to demodulate voice signals. Then, the short-time Fourier transform (STFT) spectrogram features of voice signals are divided into a training set and a test set. The training set is input into the established CV-CNN-LSTM model for model training, and the test set is input into the trained model for testing. The experimental findings reveal that the proposed CV-CNN-LSTM model demonstrates exceptional speech enhancement performance, boasting an average Perceptual Evaluation of Speech Quality (PESQ) score of 3.148. In comparison to the CV-CNN and CV-LSTM models, this innovative model achieves a remarkable PESQ score improvement of 9.7% and 11.4%, respectively. Furthermore, the average Short-Time Objective Intelligibility (STOI) score witnesses significant enhancements of 4.04 and 2.83 when contrasted with the CV-CNN and CV-LSTM models, respectively.

Keywords: CV-CNN; CV-LSTM; external Fabry–Perot interferometer; optical fiber sensor; speech enhancement.