Multimodal interaction: Input-output modality combinations for identification tasks in augmented reality

Appl Ergon. 2022 Nov:105:103842. doi: 10.1016/j.apergo.2022.103842. Epub 2022 Jul 19.

Abstract

Multimodal interaction (MMI) is being widely implemented, especially in new technologies such as augmented reality (AR) systems, since it is presumed to support a more natural, efficient, and flexible form of interaction. However, limited research has investigated the proper application of MMI in AR; in particular, the effects of combining different input and output modalities during MMI in AR are still not fully understood. This study therefore examines the independent and combined effects of different input and output modalities during a typical AR task. Twenty young adults participated in a controlled experiment in which they performed a simple identification task on an AR device under three input conditions (speech, gesture, multimodal) and four output conditions (VV-VA, VV-NA, NV-VA, NV-NA). Results showed that input and output modalities differed in their influence on task performance, workload, perceived appropriateness, and user preference. Interaction effects between the input and output conditions on the performance metrics were also evident, suggesting that although users generally prefer multimodal input, it should be implemented with caution: its effectiveness is highly influenced by the processing code of the system output. This study, the first of its kind, reveals several new implications for the application of MMI in AR systems.

Keywords: Augmented reality; Modality combination; Multimodal interaction; Processing codes; Sensory modalities.