Efficient Violence Detection in Surveillance

Sensors (Basel). 2022 Mar 13;22(6):2216. doi: 10.3390/s22062216.

Abstract

Intelligent video surveillance systems are rapidly being introduced to public places. The adoption of computer vision and machine learning techniques enables various applications for collected video features, one of the most important being safety monitoring. The effectiveness of violent event detection depends on both the computational efficiency and the accuracy of the detector. In this paper, we present a novel architecture for violence detection from video surveillance cameras. Our proposed model uses a U-Net-like network with MobileNet V2 as the encoder for spatial feature extraction, followed by an LSTM for temporal feature extraction and classification. The proposed model is computationally light and still achieves good results: experiments showed an average accuracy of 0.82 ± 2% and an average precision of 0.81 ± 3% on a complex real-world security camera footage dataset based on RWF-2000.
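
The abstract reports accuracy and precision as averages with a spread but does not describe the evaluation protocol. As a minimal sketch, assuming the figures are a mean and sample standard deviation over several runs, such a summary could be computed as below. The function names and the confusion counts are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
from statistics import mean, stdev


def accuracy(tp, fp, tn, fn):
    """Fraction of all clips classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of clips flagged as violent that really are violent."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def summarize(values):
    """Return (mean, sample standard deviation) of a list of scores."""
    return mean(values), stdev(values)


# Hypothetical confusion counts (tp, fp, tn, fn) for three runs.
# These numbers are illustrative only; they are NOT results from the paper.
runs = [(80, 20, 85, 15), (82, 18, 80, 20), (78, 16, 84, 22)]

accuracies = [accuracy(tp, fp, tn, fn) for tp, fp, tn, fn in runs]
precisions = [precision(tp, fp) for tp, fp, _, _ in runs]

acc_mean, acc_std = summarize(accuracies)
prec_mean, prec_std = summarize(precisions)
print(f"accuracy:  {acc_mean:.2f} +/- {acc_std:.2f}")
print(f"precision: {prec_mean:.2f} +/- {prec_std:.2f}")
```

Reporting precision alongside accuracy matters in a surveillance setting because precision reflects how many raised alarms are genuine, which accuracy alone does not show.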

Keywords: LSTM; U-Net; computer vision; deep learning; intelligent video surveillance; violence detection; violent behavior.

MeSH terms

  • Machine Learning*
  • Neural Networks, Computer*
  • Violence