Multi modality fusion transformer with spatio-temporal feature aggregation module for psychiatric disorder diagnosis

Comput Med Imaging Graph. 2024 Jun:114:102368. doi: 10.1016/j.compmedimag.2024.102368. Epub 2024 Mar 19.

Abstract

Bipolar disorder (BD) is characterized by recurrent episodes of depression and mild mania. In this paper, to address the common issue of insufficient accuracy in existing methods and meet the requirements of clinical diagnosis, we propose a framework called Spatio-temporal Feature Fusion Transformer (STF2Former). It improves on our previous work, MFFormer, by introducing a Spatio-temporal Feature Aggregation Module (STFAM) to learn the temporal and spatial features of resting-state functional MRI (rs-fMRI) data. The framework promotes intra-modality attention and information fusion across different modalities. Specifically, the method decouples the temporal and spatial dimensions and uses two separate feature extraction modules, one for temporal information and one for spatial information. Extensive experiments demonstrate the effectiveness of the proposed STFAM in extracting features from rs-fMRI, and show that STF2Former significantly outperforms MFFormer and achieves better results than other state-of-the-art methods.
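The decoupling of temporal and spatial dimensions described above can be illustrated with a minimal sketch. This is not the authors' STFAM: it assumes the rs-fMRI input is a small time-points-by-regions matrix, uses plain unparameterized scaled dot-product self-attention in each branch, and merges the branches by mean pooling and concatenation. All function names (`self_attention`, `mean_pool`, `aggregate`) are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    # Unparameterized scaled dot-product self-attention:
    # each token attends to every token, with no learned weights (assumption).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def mean_pool(tokens):
    # Average the token vectors into a single feature vector.
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]


def aggregate(signal):
    # signal: T x R list of lists (time points x brain regions).
    temporal_tokens = signal                              # one token per time point
    spatial_tokens = [list(c) for c in zip(*signal)]      # one token per region
    temporal_feat = mean_pool(self_attention(temporal_tokens))  # length R
    spatial_feat = mean_pool(self_attention(spatial_tokens))    # length T
    return temporal_feat + spatial_feat                   # concatenation, length R + T


if __name__ == "__main__":
    toy = [[0.1, 0.4, -0.2],
           [0.3, 0.0, 0.5],
           [-0.1, 0.2, 0.2],
           [0.0, 0.6, -0.3]]  # 4 time points, 3 regions (made-up values)
    print(len(aggregate(toy)))
```

Treating rows as tokens in one branch and columns as tokens in the other is one simple way to keep temporal and spatial attention separate; a real model would add learned projections, multiple heads, and a fusion step with the other imaging modalities.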

Keywords: Bipolar disorder; Magnetic resonance imaging; Medical diagnosis; Multimodal deep learning; Spatio-temporal feature aggregation module.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Learning*
  • Mental Disorders*