Weapon Violence Dataset 2.0: A synthetic dataset for violence detection

Data Brief. 2024 Apr 20;54:110448. doi: 10.1016/j.dib.2024.110448. eCollection 2024 Jun.

Abstract

Satisfying the appetite of data-hungry models is becoming an increasingly challenging task. The challenge is magnified in sensitive research areas, where genuine data is hard to obtain. The study of violence is a clear example: it raises ethical considerations, and authentic real-world data is scarce and predominantly accessible only to law enforcement agencies. Existing datasets in this field often rely on content from movies or open video platforms such as YouTube, which further underlines the scarcity of authentic data. To address this, we introduce what is, to our knowledge, the first synthetic virtual dataset for violence detection, named the Weapon Violence Dataset (WVD). The dataset is generated by creating virtual violence scenarios inside the photorealistic video game Grand Theft Auto V (GTA-V). It includes carefully selected video clips of person-to-person fights captured from a frontal view, featuring various weapons, both hot and cold, at different times of the day. Specifically, WVD contains three categories: Hot violence and Cold violence (together representing the violence category) and No violence (the control class). The dataset is designed so that the research community can train deep models on synthetic data and expand the data corpus if the need arises. The dataset is publicly available on Kaggle and comprises standard RGB and optical flow videos.

Keywords: GTA-V; Hot and Cold weapons; Synthetic virtual violence; Violence detection; WVD.