1D-CNN-based audio tampering detection using ENF signals

Sci Rep. 2024 May 16;14(1):11186. doi: 10.1038/s41598-024-60813-0.

Abstract

The widespread adoption of digital audio recording has transformed its use in digital forensics, particularly in civil litigation and criminal prosecution. Analysis of the electric network frequency (ENF) has emerged as a reliable technique in audio forensics. However, the absence of comprehensive ENF reference datasets limits current ENF-based methods. To address this, this study introduces ATD, a blind audio forensics framework based on a one-dimensional convolutional neural network (1D-CNN) model. ATD can identify phase mutations and waveform discontinuities within the tampered ENF signal without relying on an ENF reference database. To enhance feature extraction, the framework incorporates characteristics of the fundamental harmonics of ENF signals. In addition, a denoising method termed ENF noise reduction (ENR), based on variational mode decomposition (VMD) and a robust filtering algorithm (RFA), is proposed to reduce the impact of external noise on embedded ENF signals. This study investigates three distinct types of audio tampering (deletion, insertion, and replacement) and designs binary-class and four-class tampering detection scenarios tailored to these tampering types. ATD achieves a tampering detection accuracy of over 93% in the four-class scenario and over 96% in the binary-class scenario. Extensive experiments confirm the effectiveness, efficiency, adaptability, and robustness of ATD in both the binary-class and four-class scenarios.
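As a rough illustration of the "phase mutation" that the abstract refers to, the following minimal sketch shows how deleting a few samples from a recording shifts the phase of an embedded mains hum. It is not the ATD framework: there is no 1D-CNN, no ENR denoising, and no harmonic feature extraction. The 50 Hz nominal frequency, the 1 kHz sampling rate, the frame length, and the function names `frame_phases` and `max_phase_jump` are all assumptions chosen for the example.

```python
import math

FS = 1000        # assumed sampling rate in Hz (illustrative only)
F0 = 50.0        # assumed nominal ENF in Hz (60 Hz in some regions)
FRAME_LEN = 200  # 200 samples = 10 full cycles of 50 Hz at 1 kHz

def frame_phases(signal, fs=FS, f0=F0, frame_len=FRAME_LEN):
    """Estimate the phase of the f0 component in consecutive
    non-overlapping frames by correlating with a complex exponential
    that uses an absolute time reference."""
    phases = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        re = 0.0
        im = 0.0
        for n in range(frame_len):
            theta = 2 * math.pi * f0 * (start + n) / fs
            re += signal[start + n] * math.cos(theta)
            im -= signal[start + n] * math.sin(theta)
        phases.append(math.atan2(im, re))
    return phases

def max_phase_jump(phases):
    """Largest absolute frame-to-frame phase change, wrapped to [-pi, pi)."""
    jumps = [
        abs((b - a + math.pi) % (2 * math.pi) - math.pi)
        for a, b in zip(phases, phases[1:])
    ]
    return max(jumps)

# Two seconds of a clean synthetic ENF-like tone.
clean = [math.cos(2 * math.pi * F0 * n / FS) for n in range(2 * FS)]

# Simulate a deletion: remove 5 samples (a quarter of a 50 Hz cycle).
tampered = clean[:1100] + clean[1105:]

clean_jump = max_phase_jump(frame_phases(clean))
tampered_jump = max_phase_jump(frame_phases(tampered))
print(f"max phase jump, clean:    {clean_jump:.4f} rad")
print(f"max phase jump, tampered: {tampered_jump:.4f} rad")
```

In the untouched signal the estimated phase stays essentially constant from frame to frame, while the deletion produces a clear jump around the splice point. A real recording would carry noise and a slowly drifting ENF, which is why the framework described here relies on denoising and a learned classifier rather than a fixed threshold.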