Effective Techniques for Multimodal Data Fusion: A Comparative Analysis

Maciej Pawłowski; Anna Wróblewska; Sylwia Sysko-Romańczuk

doi:10.3390/s23052381

Sensors (Basel). 2023 Feb 21;23(5):2381. doi: 10.3390/s23052381.

Authors

Maciej Pawłowski¹, Anna Wróblewska¹,², Sylwia Sysko-Romańczuk³

Affiliations

¹ Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa Street 75, 00-662 Warsaw, Poland.
² WeSub, Adama Branickiego Street 17, 02-972 Warsaw, Poland.
³ Faculty of Management, Warsaw University of Technology, Narbutta Street 85, 02-524 Warsaw, Poland.

Abstract

Data processing in robotics is currently challenged by the effective building of multimodal and common representations. Tremendous volumes of raw data are available and their smart management is the core concept of multimodal learning in a new paradigm for data fusion. Although several techniques for building multimodal representations have been proven successful, they have not yet been analyzed and compared in a given production setting. This paper explored three of the most common techniques, (1) the late fusion, (2) the early fusion, and (3) the sketch, and compared them in classification tasks. Our paper explored different types of data (modalities) that could be gathered by sensors serving a wide range of sensor applications. Our experiments were conducted on Amazon Reviews, MovieLens25M, and MovieLens1M datasets. Their outcomes allowed us to confirm that the choice of fusion technique for building multimodal representation is crucial to obtain the highest possible model performance resulting from the proper modality combination. Consequently, we designed criteria for choosing this optimal data fusion technique.

Keywords: comparative analysis; data fusion; deep learning in sensor systems; multimodal learning; multimodal representation; neural networks.
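
The difference between two of the techniques named in the abstract, early and late fusion, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes the common usage of the terms: early fusion concatenates per-modality feature vectors before a single model, while late fusion combines the predictions of separate per-modality models (here by simple averaging, one of several possible rules). The feature sizes, the random untrained weights, and all function names are hypothetical. The sketch technique is not covered here, since the abstract gives no details of it.

```python
import numpy as np

def early_fusion(modalities):
    """Concatenate per-modality feature matrices along the feature axis."""
    return np.concatenate(modalities, axis=1)

def late_fusion(per_modality_probs):
    """Average class-probability matrices from separate per-modality models."""
    return np.mean(np.stack(per_modality_probs, axis=0), axis=0)

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Toy data (hypothetical): 4 samples, 3-dim text and 2-dim image features.
rng = np.random.default_rng(0)
text_features = rng.normal(size=(4, 3))
image_features = rng.normal(size=(4, 2))

# Early fusion: one joint input of shape (4, 5) for a single classifier.
fused_input = early_fusion([text_features, image_features])

# Late fusion: stand-in, untrained linear "models" per modality (assumption),
# each producing probabilities over 2 classes, then averaged.
w_text = rng.normal(size=(3, 2))
w_image = rng.normal(size=(2, 2))
text_probs = softmax(text_features @ w_text)
image_probs = softmax(image_features @ w_image)
fused_probs = late_fusion([text_probs, image_probs])
```

In this sketch, early fusion lets a single model see cross-modality interactions in the joint input, whereas late fusion keeps the modalities independent until the prediction stage. Averaging preserves valid probabilities because each row of the inputs already sums to one.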
