Preprocessing for Keypoint-Based Sign Language Translation without Glosses

Youngmin Kim; Hyeongboo Baek

doi:10.3390/s23063231

Sensors (Basel). 2023 Mar 17;23(6):3231. doi: 10.3390/s23063231.

Authors

Youngmin Kim¹, Hyeongboo Baek¹

Affiliation

¹ Department of Computer Science and Engineering, Incheon National University (INU), Incheon 22012, Republic of Korea.

Abstract

While machine translation for spoken language has advanced significantly, research on sign language translation (SLT) for deaf individuals remains limited. Obtaining annotations, such as glosses, can be expensive and time-consuming. To address these challenges, we propose a new sign language video-processing method for SLT without gloss annotations. Our approach uses the signer's skeleton points to identify their movements, which helps build a model that is robust to background noise. We also introduce a keypoint normalization process that preserves the signer's movements while accounting for variations in body length. Furthermore, we propose a stochastic frame selection technique that prioritizes frames so as to minimize the loss of video information. Using an attention-based model, we demonstrate the effectiveness of our approach through quantitative experiments on various metrics with German and Korean sign language datasets without glosses.
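The abstract does not specify how the normalization or frame selection is carried out, so the following is only an illustrative sketch of the two ideas, not the authors' method. It assumes that each frame is a list of 2D (x, y) keypoints, that normalization centers the keypoints on the shoulder midpoint and divides by shoulder width, and that frame selection draws one random frame from each of several equal-length segments of the video. The function names, the shoulder indices, and both of these choices are hypothetical.

```python
import math
import random


def normalize_keypoints(frame, left_shoulder=0, right_shoulder=1):
    """Center (x, y) keypoints on the shoulder midpoint and scale by
    shoulder width. The reference joints are an assumption, not taken
    from the paper."""
    lx, ly = frame[left_shoulder]
    rx, ry = frame[right_shoulder]
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0
    # Fall back to 1.0 if both shoulders coincide, to avoid dividing by zero.
    scale = math.hypot(lx - rx, ly - ry) or 1.0
    return [((x - cx) / scale, (y - cy) / scale) for x, y in frame]


def sample_frames(num_frames, target_len, rng=None):
    """Return frame indices in temporal order: split the video into
    target_len equal segments and pick one random index per segment
    (an assumed stand-in for stochastic frame selection)."""
    rng = rng or random.Random()
    if num_frames <= target_len:
        # Short videos are kept whole.
        return list(range(num_frames))
    bounds = [i * num_frames // target_len for i in range(target_len + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(target_len)]


if __name__ == "__main__":
    frame = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
    print(normalize_keypoints(frame))
    print(sample_frames(100, 10, random.Random(0)))
```

Under these assumptions, two signers who make the same movement at different positions or scales in the image map to the same normalized coordinates, and every part of the video contributes a frame even though the exact frames differ between draws.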

Keywords: computer vision; deep learning; sign language translation; video processing.

Grants and funding

2021/Incheon National University