Sign Language Dataset for Automatic Motion Generation

María Villa-Monedero; Manuel Gil-Martín; Daniel Sáez-Trigueros; Andrzej Pomirski; Rubén San-Segundo

doi:10.3390/jimaging9120262

J Imaging. 2023 Nov 27;9(12):262. doi: 10.3390/jimaging9120262.

Authors

María Villa-Monedero¹, Manuel Gil-Martín¹, Daniel Sáez-Trigueros², Andrzej Pomirski³, Rubén San-Segundo¹

Affiliations

¹ Grupo de Tecnología del Habla y Aprendizaje Automático (T.H.A.U. Group), Information Processing and Telecommunications Center, E.T.S.I. de Telecomunicación, Universidad Politécnica de Madrid, 28040 Madrid, Spain.
² Alexa AI, C. de Ramírez de Prado, 5, 28045 Madrid, Spain.
³ Alexa AI, Aleja Grunwaldzka 472, 80-309 Gdańsk, Poland.

Abstract

Several sign language datasets are available in the literature, but most of them are designed for sign language recognition and translation. This paper presents a new sign language dataset for automatic motion generation. The dataset includes the phonemes of each sign (specified in HamNoSys, a transcription system developed at the University of Hamburg, Hamburg, Germany) and the corresponding motion information. The motion information comprises the sign videos and the sequences of extracted landmarks associated with relevant points of the skeleton (including face, arms, hands, and fingers). The dataset contains 754 signs performed by three different subjects in three different positions: the entire alphabet, numbers from 0 to 100, numbers for hour specification, months, weekdays, and the most frequent signs used in Spanish Sign Language (LSE). In total, there are 6786 videos (754 signs × 3 subjects × 3 positions) with their corresponding phonemes (HamNoSys annotations). From each video, a sequence of landmarks was extracted using MediaPipe. The dataset allows training an automatic system for motion generation from sign language phonemes. This paper also presents preliminary results in motion generation from sign phonemes, obtaining a Dynamic Time Warping distance per frame of 0.37.
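
To illustrate the kind of metric reported above, the following is a minimal sketch of a Dynamic Time Warping (DTW) distance per frame between a generated and a reference landmark sequence. The abstract does not specify the frame-level distance or the normalization, so the Euclidean frame distance, the division by the number of reference frames, and the function names `frame_distance` and `dtw_per_frame` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two frames (flat lists of landmark coordinates).

    Assumption: the abstract does not state which frame-level distance is used.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_per_frame(generated, reference):
    """Classic DTW alignment cost, divided by the number of reference frames.

    Assumption: normalizing by the reference length is one plausible reading
    of "distance per frame"; other normalizations are possible.
    """
    n, m = len(generated), len(reference)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    # cost[i][j] = minimal accumulated cost aligning the first i generated
    # frames with the first j reference frames.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(generated[i - 1], reference[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # generated frame repeated against reference
                cost[i][j - 1],      # reference frame repeated against generated
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m] / m
```

Because DTW allows frames to be repeated during alignment, two sequences that trace the same motion at different speeds can still obtain a distance of zero, which is why it suits comparing generated and recorded signs of different lengths.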

Keywords: HamNoSys; landmarks extraction; motion dataset; sign language; sign phonemes.

Grants and funding

M. Villa-Monedero’s scholarship has been supported by Amazon through the IPTC-Amazon collaboration.