FRDA: Fingerprint Region based Data Augmentation using explainable AI for FTIR based microplastics classification

Sci Total Environ. 2023 Oct 20:896:165340. doi: 10.1016/j.scitotenv.2023.165340. Epub 2023 Jul 4.

Abstract

Marine microplastic (MP) contamination has become a serious hazard to aquatic life and human health. Many machine learning (ML) based approaches to MP identification have been proposed using Attenuated Total Reflection Fourier Transform Infrared Spectroscopy (ATR-FTIR). A major challenge in training MP identification models is that MP datasets are imbalanced and contain too few samples, a problem that is compounded when copolymers and mixtures are present. Data augmentation is an effective way to improve ML performance in identifying MPs. This work uses Explainable Artificial Intelligence (XAI) and Gaussian Mixture Models (GMM) to reveal how much each FTIR spectral region influences the identification of each type of MP. Based on the identified regions, this work proposes a Fingerprint Region based Data Augmentation (FRDA) method that generates new FTIR data to supplement MP datasets. The evaluation results show that FRDA outperforms existing spectral data augmentation approaches.
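
The abstract does not spell out the FRDA procedure, so the following is only a minimal sketch of one plausible reading of region-aware augmentation, not the authors' algorithm. It assumes the influential regions have already been identified (for example by an XAI attribution step followed by GMM fitting, which is not shown) and are supplied as wavenumber intervals. The function name `augment_spectrum`, the `noise_std` parameter, the interval values, and the choice to leave the influential regions untouched while adding Gaussian noise elsewhere are all illustrative assumptions.

```python
import random


def augment_spectrum(absorbance, wavenumbers, regions, noise_std=0.01, rng=None):
    """Return a perturbed copy of one spectrum (hypothetical helper).

    Points whose wavenumber lies inside any (low, high) interval in
    `regions` are copied unchanged; every other point receives additive
    Gaussian noise with standard deviation `noise_std`.
    """
    rng = rng or random.Random()
    augmented = []
    for wn, value in zip(wavenumbers, absorbance):
        in_region = any(low <= wn <= high for low, high in regions)
        augmented.append(value if in_region else value + rng.gauss(0.0, noise_std))
    return augmented


# Toy data: a flat spectrum sampled every 100 cm^-1 from 4000 down to 500.
wavenumbers = [4000 - 100 * i for i in range(36)]
absorbance = [0.5] * len(wavenumbers)

# Assumed influential intervals in cm^-1; made up for this example,
# not taken from the paper.
regions = [(1400, 1500), (2800, 3000)]

new_spectrum = augment_spectrum(
    absorbance, wavenumbers, regions, noise_std=0.01, rng=random.Random(0)
)
```

Calling `augment_spectrum` repeatedly with different seeds yields several synthetic variants of a minority-class spectrum that share the same values inside the assumed influential regions. Whether FRDA preserves, recombines, or transforms those regions is not stated in the abstract.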

Keywords: Data augmentation; Data pre-processing; Deep learning; FTIR; Machine learning; Microplastic identification.