Abnormal Activity Recognition from Surveillance Videos Using Convolutional Neural Network

Sensors (Basel). 2021 Dec 11;21(24):8291. doi: 10.3390/s21248291.

Abstract

Background and motivation: Every year, millions of Muslims worldwide come to Mecca to perform the Hajj. In order to maintain the security of the pilgrims, the Saudi government has installed about 5000 closed circuit television (CCTV) cameras to monitor crowd activity efficiently.

Problem: As a result, these cameras generate an enormous amount of visual data through manual or offline monitoring, requiring numerous human resources for efficient tracking. Therefore, there is an urgent need to develop an intelligent and automatic system in order to efficiently monitor crowds and identify abnormal activity.

Method: The existing method is incapable of extracting discriminative features from surveillance videos as pre-trained weights of different architectures were used. This paper develops a lightweight approach for accurately identifying violent activity in surveillance environments. As the first step of the proposed framework, a lightweight CNN model is trained on our own pilgrim's dataset to detect pilgrims from the surveillance cameras. These preprocessed salient frames are passed to a lightweight CNN model for spatial features extraction in the second step. In the third step, a Long Short Term Memory network (LSTM) is developed to extract temporal features. Finally, in the last step, in the case of violent activity or accidents, the proposed system will generate an alarm in real time to inform law enforcement agencies to take appropriate action, thus helping to avoid accidents and stampedes.

Results: We have conducted multiple experiments on two publicly available violent activity datasets, such as Surveillance Fight and Hockey Fight datasets; our proposed model achieved accuracies of 81.05 and 98.00, respectively.

Keywords: CCTV; CNN; Hajj pilgrims monitoring; LSTM; crowd monitoring; lightweight; violent activity recognition.

MeSH terms

  • Crowding
  • Humans
  • Memory, Long-Term
  • Neural Networks, Computer*
  • Recognition, Psychology*
  • Videotape Recording