Middle-shallow feature aggregation in multimodality for face anti-spoofing

Sci Rep. 2023 Jun 19;13(1):9870. doi: 10.1038/s41598-023-36636-w.

Abstract

At present, most advanced face anti-spoofing algorithms use stacked convolutions and residual structures to extract complex deep-network features, and then use those features to distinguish live faces from spoofing attacks. These methods ignore the shallow features, which carry more detailed information. As a result, the model lacks sufficient fine-grained information, which reduces the accuracy and robustness of the algorithm. In this paper, we use the simple features of the shallow network to add fine-grained information to the model and thereby improve its performance. First, the shallow features are concatenated to the middle layer through a "shortcut" structure, which preserves more detail in the middle-layer features and improves their ability to represent detail. Second, the network is initialized with the best parameters obtained by pre-training on unbalanced samples and is then trained on balanced samples to improve the classification ability of the model. Finally, an RS Block based on depthwise separable convolution replaces the residual module, reducing the model parameters and floating-point operations from 61 M and 18 G to 1.9 M and 347 M, respectively. The algorithm is evaluated on the CASIA-SURF dataset, and the results show that the average classification error rate (ACER) is only 0.0008 and that TPR@FPR = 10^-4 reaches 0.9990, which is better than previous face anti-spoofing methods.
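To illustrate why replacing a standard convolution with a depthwise separable one cuts the parameter count so sharply, the following is a minimal sketch of the weight arithmetic for a single layer. The layer sizes (`c_in`, `c_out`, `k`) and the function names are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not reproduce the RS Block itself.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise convolution followed by a 1 x 1
    pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, assumed for illustration only.
c_in, c_out, k = 128, 128, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = std / sep

print(f"standard: {std}, depthwise separable: {sep}, reduction: {ratio:.1f}x")
```

For a fixed output resolution, the multiply-accumulate count scales with the same per-layer weight count, so the reduction factor for operations in such a layer is comparable to the reduction in parameters.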