Towards automation of dynamic-gaze video analysis taking functional upper-limb tasks as a case study

Musa Alyaman; Mohammad Sobuh; Alaa Abu Zaid; Laurence Kenney; Adam J Galpin; Majid A Al-Taee

doi:10.1016/j.cmpb.2021.106041

Comput Methods Programs Biomed. 2021 May:203:106041. doi: 10.1016/j.cmpb.2021.106041. Epub 2021 Mar 7.

Authors

Musa Alyaman¹, Mohammad Sobuh², Alaa Abu Zaid³, Laurence Kenney⁴, Adam J Galpin⁴, Majid A Al-Taee⁵

Affiliations

¹ Mechatronics Engineering Department, School of Engineering, The University of Jordan, Amman, 11942, Jordan. Electronic address: m.alyaman@ju.edu.jo.
² Department of Orthotics & Prosthetics, School of Rehabilitation Sciences, The University of Jordan, Amman, 11942, Jordan.
³ Mechatronics Engineering Department, School of Engineering, The University of Jordan, Amman, 11942, Jordan.
⁴ School of Health and Society, University of Salford, Manchester M5 4WT, UK.
⁵ School of Electrical Engineering, Electronics and Computer Science, University of Liverpool, Liverpool L69 3BX, UK.

PMID: 33756186
DOI: 10.1016/j.cmpb.2021.106041

Abstract

Background and objective: Previous studies in motor control have yielded clear evidence that gaze behavior (where someone looks) quantifies the attention paid when performing actions. However, extracting clinically meaningful results from gaze data has so far been done manually, which makes the process tedious, time-consuming, and highly subjective. This paper studies the feasibility of automating the coding of gaze data, taking functional upper-limb tasks as a case study.

Methods: This is achieved by developing a new algorithm that codes the collected gaze data in three main stages: data preparation, data processing, and output generation. The input data, in the form of a crosshair and a gaze video, are converted into a 25 Hz frame sequence. Key frames and non-key frames are then obtained and processed using a combination of image processing techniques and a fuzzy logic controller. In each trial, the location and duration of gaze fixation at the areas of interest (AOIs) are obtained. Once the gaze data are coded, they can be presented in different forms and formats, including a stacked color bar.
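
As a rough illustration of the output-generation stage, the sketch below turns a per-frame sequence of AOI labels into fixation runs and total fixation durations, assuming only the 25 Hz frame rate stated above. It is not the authors' implementation: the function names, the AOI label strings, and the idea that every frame has already been assigned exactly one AOI label are assumptions made for this example, and the image processing and fuzzy logic steps that would produce those labels are not shown.

```python
from collections import Counter
from itertools import groupby

FRAME_RATE_HZ = 25  # frame rate stated in the abstract


def fixation_runs(frame_labels, frame_rate_hz=FRAME_RATE_HZ):
    """Collapse per-frame AOI labels into (aoi, start_s, duration_s) runs.

    `frame_labels` is a hypothetical input: one AOI label per video frame.
    """
    runs = []
    frame_index = 0
    for aoi, group in groupby(frame_labels):
        n_frames = sum(1 for _ in group)
        start_s = frame_index / frame_rate_hz
        duration_s = n_frames / frame_rate_hz
        runs.append((aoi, start_s, duration_s))
        frame_index += n_frames
    return runs


def total_duration_per_aoi(frame_labels, frame_rate_hz=FRAME_RATE_HZ):
    """Total time (seconds) spent on each AOI across the whole trial."""
    counts = Counter(frame_labels)
    return {aoi: n / frame_rate_hz for aoi, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical trial: 5 frames on the hand, 10 on the object, 10 on the hand.
    labels = ["hand"] * 5 + ["object"] * 10 + ["hand"] * 10
    for aoi, start_s, duration_s in fixation_runs(labels):
        print(f"{aoi}: starts at {start_s:.2f} s, lasts {duration_s:.2f} s")
    print(total_duration_per_aoi(labels))
```

The list of runs returned by `fixation_runs` is the kind of data a stacked color bar could be drawn from, with one colored segment per run.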

Results: The developed coding algorithm agrees closely with the manual coding method but is significantly faster and less prone to unsystematic errors. Statistical analysis showed that Cohen's Kappa ranges from 0.705 to 1.0. Moreover, based on the intra-class correlation coefficient (ICC), the agreement index between the computerized and manual coding methods is (i) 0.908 with a 95% confidence interval of (0.867, 0.937) for the anatomical hand and (ii) 0.923 with a 95% confidence interval of (0.888, 0.948) for the prosthetic hand. A Bland-Altman plot also showed that all data points are closely scattered around the mean. These findings confirm the validity and effectiveness of the developed coding algorithm.
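
For readers unfamiliar with the agreement statistic, the following is a minimal sketch of how Cohen's Kappa could be computed between manual and computerized frame-by-frame AOI codes. The label sequences are invented for illustration and are not data from the study, and the function name is an assumption of this example; the abstract does not state which software the authors used for the analysis.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equal-length sequences of categorical codes."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items coded identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two coders' marginal proportions,
    # summed over categories.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both coders used one identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical codes for four frames (not data from the paper).
    manual = ["hand", "hand", "object", "object"]
    automated = ["hand", "hand", "object", "hand"]
    print(cohens_kappa(manual, automated))
```

Kappa equals 1.0 when the two codings are identical and falls towards 0 as agreement approaches what chance alone would produce.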

Conclusion: The developed algorithm demonstrated that it is feasible to automate the coding of the gaze data, reduce the coding time, and improve the coding process's reliability.

Keywords: Fixation duration; Fuzzy logic; Gaze tracking; Image processing; Upper limb tasks; Video analysis.

MeSH terms

Automation
Fixation, Ocular*
Hand
Image Processing, Computer-Assisted*
Reproducibility of Results