AF: An Association-Based Fusion Method for Multi-Modal Classification

IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):9236-9254. doi: 10.1109/TPAMI.2021.3125995. Epub 2022 Nov 7.

Abstract

Multi-modal classification (MMC) aims to integrate complementary information from different modalities to improve classification performance. Existing MMC methods can be grouped into two categories: traditional methods and deep learning-based methods. Traditional methods often implement fusion in the low-level original space. Moreover, they mostly focus on inter-modal fusion and neglect intra-modal fusion. As a result, the representation capacity of the fused features they produce is insufficient. Deep learning-based methods implement fusion in a high-level feature space where the associations among features are considered, but the whole process is implicit and the fused space lacks interpretability. Based on these observations, we propose a novel interpretable association-based fusion method for MMC, named AF. In AF, both the association information and the high-order information extracted from the feature space are simultaneously encoded into a new feature space to help train an MMC model in an explicit manner. Moreover, AF is a general fusion framework, and most existing MMC methods can be embedded into it to improve their performance. Finally, the effectiveness and generality of AF are validated on 22 datasets, using four typical traditional MMC methods (adopting the best-modality, early, late, and model fusion strategies) and one deep learning-based MMC method.
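
For readers unfamiliar with the fusion strategies named above, the following is a minimal sketch of the general difference between early fusion (concatenate modality features, then train one classifier) and late fusion (train one classifier per modality, then combine their scores). It is not the AF method and not the authors' experimental setup: the nearest-centroid classifier, the synthetic two-modality data, and all function names (fit_centroids, class_scores, early_fusion_predict, late_fusion_predict) are assumptions chosen only to keep the illustration short.

```python
import numpy as np

def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean vector per class (illustrative stand-in)."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def class_scores(X, centroids):
    """Softmax over negative squared distances to each class centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def early_fusion_predict(train_mods, y, test_mods):
    """Early fusion: concatenate all modality features, then fit a single model."""
    X_train = np.hstack(train_mods)
    X_test = np.hstack(test_mods)
    classes, centroids = fit_centroids(X_train, y)
    return classes[class_scores(X_test, centroids).argmax(axis=1)]

def late_fusion_predict(train_mods, y, test_mods):
    """Late fusion: fit one model per modality, then sum the per-class scores."""
    total = None
    classes = np.unique(y)
    for X_train, X_test in zip(train_mods, test_mods):
        classes, centroids = fit_centroids(X_train, y)
        scores = class_scores(X_test, centroids)
        total = scores if total is None else total + scores
    return classes[total.argmax(axis=1)]

if __name__ == "__main__":
    # Synthetic data (assumption): two classes, two modalities with 3 and 2 features.
    rng = np.random.default_rng(0)
    y = np.array([0] * 50 + [1] * 50)
    mod_a = rng.normal(size=(100, 3)) + 5.0 * y[:, None]
    mod_b = rng.normal(size=(100, 2)) + 4.0 * y[:, None]
    early = early_fusion_predict([mod_a, mod_b], y, [mod_a, mod_b])
    late = late_fusion_predict([mod_a, mod_b], y, [mod_a, mod_b])
    print("early fusion accuracy:", (early == y).mean())
    print("late fusion accuracy:", (late == y).mean())
```

In this sketch, early fusion lets a single model see all features jointly, whereas late fusion only combines decisions. Neither explicitly encodes the associations among features that the abstract describes as the motivation for AF.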