Multimodal interaction: Input-output modality combinations for identification tasks in augmented reality

Appl Ergon. 2022 Nov:105:103842. doi: 10.1016/j.apergo.2022.103842. Epub 2022 Jul 19.

Abstract

Multimodal interaction (MMI) is being widely implemented, especially in new technologies such as augmented reality (AR) systems, since it is presumed to support a more natural, efficient, and flexible form of interaction. However, limited research has investigated the proper application of MMI in AR; in particular, the effects of combining different input and output modalities during MMI in AR are still not fully understood. This study therefore examines the independent and combined effects of different input and output modalities during a typical AR task. Twenty young adults participated in a controlled experiment in which they performed a simple identification task on an AR device under three input conditions (speech, gesture, multimodal) and four output conditions (VV-VA, VV-NA, NV-VA, NV-NA). Results showed that input and output modalities differed in their influence on task performance, workload, perceived appropriateness, and user preference. Interaction effects between the input and output conditions on the performance metrics were also evident, suggesting that although users generally prefer multimodal input, it should be implemented with caution: its effectiveness is highly influenced by the processing code of the system output. This study, the first of its kind, reveals several new implications for the application of MMI in AR systems.

Keywords: Augmented reality; Modality combination; Multimodal interaction; Processing codes; Sensory modalities.