Comparative Study of Popular Deep Learning Models for Machining Roughness Classification Using Sound and Force Signals

Binayak Bhandari

doi:10.3390/mi12121484

Micromachines (Basel). 2021 Nov 29;12(12):1484. doi: 10.3390/mi12121484.

Author

Binayak Bhandari¹

Affiliation

¹ Department of Railroad Engineering & Transport Management, Woosong University, Daejeon 300718, Korea.

Abstract

This study compared popular Deep Learning (DL) architectures for classifying machined surface roughness from sound and force data. The DL architectures considered were the Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and transformer. Classification was performed on sound and force data generated while machining aluminum sheets at different levels of spindle speed, feed rate, depth of cut, and end-mill diameter; the models were trained on 30 s of machining data (the 10-40 s interval) of the machining experiments. Since raw audio waveforms are seldom used directly in DL models, Mel-Spectrogram and Mel Frequency Cepstral Coefficient (MFCC) features were extracted from the audio and used as model inputs. The DL models were compared in terms of training and validation accuracy, training epochs, and training parameters. Although roughness classification by all the DL models was satisfactory (except for the CNN with Mel-Spectrogram), the transformer-based models had the highest training (>96%) and validation (≈90%) accuracies. The CNN model with Mel-Spectrogram exhibited the worst training and inference accuracy, which was influenced by the limited training data. Confusion matrices were plotted to visualize the classification accuracy. They showed that the transformer model trained on Mel-Spectrogram and the transformer model trained on MFCCs correctly predicted 366 (91.5%) and 371 (92.7%) of the 400 test samples, respectively. This study also highlights the suitability of the transformer model for time-series sound and force data and its superiority over the other DL models.
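The accuracy figures quoted from the confusion matrices follow from a simple ratio: correctly classified samples (the matrix diagonal) divided by all test samples. The sketch below illustrates that calculation under stated assumptions. The function name `accuracy_from_confusion` and the 3-class `toy` matrix are hypothetical and are not taken from the study; only the counts 366, 371, and 400 come from the abstract.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy: sum of the diagonal divided by the sum of all entries."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
# Illustrative only; these are NOT results from the study.
toy = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
toy_acc = accuracy_from_confusion(toy)  # 24 correct out of 30

# Counts reported in the abstract: correct predictions out of 400 test samples.
TOTAL_TEST_SAMPLES = 400
reported_correct = {
    "transformer + Mel-Spectrogram": 366,
    "transformer + MFCCs": 371,
}
percentages = {
    name: 100 * count / TOTAL_TEST_SAMPLES
    for name, count in reported_correct.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

With the reported counts, 366/400 gives 91.5% and 371/400 gives 92.75%, which the abstract states as 92.7%.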

Keywords: CNN; Deep Learning; LSTM; MLP; Smart Factory; attention mechanisms; classification; confusion matrix; precision machining; sound feature extraction.