The VISTA datasets, a combination of inertial sensors and depth cameras data for activity recognition

Sci Data. 2022 May 18;9(1):218. doi: 10.1038/s41597-022-01324-3.

Abstract

This paper makes the VISTA database, composed of inertial and visual data, publicly available for gesture and activity recognition. The inertial data were acquired with the SensHand, which can capture the movement of the wrist, thumb, index and middle fingers, while the RGB-D visual data were acquired simultaneously from two different points of view, front and side. The VISTA database was acquired in two experimental phases: in the former, the participants were asked to perform 10 different actions; in the latter, they had to execute five scenes of daily living, each corresponding to a combination of the selected actions. In both phases, the Pepper robot interacted with the participants. The two camera points of view mimic the different points of view of Pepper. Overall, the dataset includes 7682 action instances for the training phase and 3361 action instances for the testing phase. It can be seen as a framework for future studies on artificial intelligence techniques for activity recognition, using inertial-only data, visual-only data, or a sensor fusion approach.
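A common way to exploit such a multimodal dataset is feature-level fusion, in which features extracted from the inertial stream and from the RGB-D stream are concatenated per action instance before classification. The sketch below illustrates this idea only; the feature dimensions, label encoding, and classifier are assumptions for demonstration and are not taken from the VISTA release.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Hypothetical shapes: dimensions and labels are illustrative, not the
    # actual VISTA feature layout.
    N_SAMPLES = 200       # action instances
    INERTIAL_DIM = 48     # e.g. statistics over wrist/finger IMU channels
    VISUAL_DIM = 64       # e.g. pooled depth/skeleton descriptors
    N_ACTIONS = 10        # the 10 actions of the first experimental phase

    rng = np.random.default_rng(0)
    inertial_feats = rng.normal(size=(N_SAMPLES, INERTIAL_DIM))
    visual_feats = rng.normal(size=(N_SAMPLES, VISUAL_DIM))
    labels = rng.integers(0, N_ACTIONS, size=N_SAMPLES)

    # Feature-level fusion: concatenate the two modalities per instance.
    fused = np.concatenate([inertial_feats, visual_feats], axis=1)

    X_train, X_test, y_train, y_test = train_test_split(
        fused, labels, test_size=0.3, random_state=0, stratify=labels
    )

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)
    print(f"fused-modality accuracy: {clf.score(X_test, y_test):.2f}")

The same pipeline can be run on either modality alone by passing inertial_feats or visual_feats instead of the fused matrix, which mirrors the inertial-only, visual-only, and fusion settings mentioned above.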

Publication types

  • Dataset

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Gestures
  • Humans
  • Movement*
  • Wrist