Video Classification of Cloth Simulations: Deep Learning and Position-Based Dynamics for Stiffness Prediction

Makara Mao; Hongly Va; Min Hong

doi:10.3390/s24020549

Sensors (Basel). 2024 Jan 15;24(2):549. doi: 10.3390/s24020549.

Authors

Makara Mao¹, Hongly Va¹, Min Hong²

Affiliations

¹ Department of Software Convergence, Soonchunhyang University, Asan 31538, Republic of Korea.
² Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Republic of Korea.

Abstract

In virtual reality, augmented reality, and animation, the goal is to reproduce the movement of real-world deformable objects as faithfully as possible in the virtual world. This paper therefore proposes a method that automatically extracts cloth stiffness values from video scenes and applies them as material properties for virtual cloth simulation. We use deep learning (DL) models to tackle this problem. The Transformer model, combined with pre-trained architectures such as DenseNet121, ResNet50, VGG16, and VGG19, is a leading choice for video classification tasks. Position-Based Dynamics (PBD) is a computational framework widely used in computer graphics and physics-based simulation of deformable objects, notably cloth. It provides an inherently stable and efficient way to replicate complex dynamic behaviors such as folding, stretching, and collision interactions. Our proposed model characterizes virtual cloth with labels ranging from soft to stiff and accurately categorizes videos according to these labels. The cloth movement dataset used in this research is derived from a carefully designed, stiffness-oriented cloth simulation. Our experimental assessment uses a dataset of 3840 videos, which forms a multi-label video classification dataset. The proposed model achieves an average accuracy of 99.50%, significantly outperforming alternative models such as RNN, GRU, LSTM, and Transformer.

Keywords: Transformer; cloth simulation; deep learning; multi-label; position-based dynamics; video classification.
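
To make the role of stiffness in PBD concrete, the following is a minimal sketch of the standard distance-constraint projection from the PBD literature, in which a stiffness factor between 0 and 1 scales how much of the constraint error is corrected in one step. It is not taken from the paper: the function name `project_distance` and its parameters are hypothetical, and the abstract does not state how the authors' simulator parameterizes stiffness.

```python
import math


def project_distance(p1, p2, w1, w2, rest_length, stiffness):
    """Move two particles toward their rest distance (one PBD projection).

    p1, p2      -- particle positions as sequences of floats
    w1, w2      -- inverse masses (0 means the particle is pinned)
    rest_length -- target distance between the particles
    stiffness   -- assumed to lie in [0, 1]; 1 corrects the full error
    """
    delta = [a - b for a, b in zip(p1, p2)]
    dist = math.sqrt(sum(c * c for c in delta))
    w_sum = w1 + w2
    # Nothing to do if the particles coincide or both are pinned.
    if dist == 0.0 or w_sum == 0.0:
        return list(p1), list(p2)
    # Constraint error (dist - rest_length), shared by inverse mass
    # and scaled by the stiffness factor.
    scale = stiffness * (dist - rest_length) / (dist * w_sum)
    new_p1 = [a - w1 * scale * c for a, c in zip(p1, delta)]
    new_p2 = [b + w2 * scale * c for b, c in zip(p2, delta)]
    return new_p1, new_p2


if __name__ == "__main__":
    # A stretched edge (length 2, rest length 1) with equal masses.
    stiff = project_distance([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], 1.0, 1.0, 1.0, 1.0)
    soft = project_distance([0.0, 0.0, 0.0], [2.0, 0.0, 0.0], 1.0, 1.0, 1.0, 0.5)
    print(stiff)
    print(soft)
```

With a stiffness of 1 the stretched edge returns to its rest length in a single projection, while a stiffness of 0.5 removes only half of the stretch, so a lower value yields cloth that looks softer over the same number of solver iterations.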