Swin Transformer and the Unet Architecture to Correct Motion Artifacts in Magnetic Resonance Image Reconstruction

Md Biddut Hossain; Rupali Kiran Shinde; Shariar Md Imtiaz; F M Fahmid Hossain; Seok-Hee Jeon; Ki-Chul Kwon; Nam Kim

doi:10.1155/2024/8972980

Int J Biomed Imaging. 2024 May 2:2024:8972980. doi: 10.1155/2024/8972980. eCollection 2024.

Authors

Md Biddut Hossain¹, Rupali Kiran Shinde¹, Shariar Md Imtiaz¹, F M Fahmid Hossain¹, Seok-Hee Jeon², Ki-Chul Kwon¹, Nam Kim¹

Affiliations

¹ Department of Information and Communication Engineering, Chungbuk National University, Cheongju-si 28644, Chungcheongbuk-do, Republic of Korea.
² Department of Electronics Engineering, Incheon National University, 119 Academy-ro, Yeonsu-gu, Incheon 22012, Republic of Korea.

Abstract

We present a deep learning-based method that corrects motion artifacts and thus accelerates data acquisition and reconstruction of magnetic resonance images. The novel model, the Motion Artifact Correction by Swin Network (MACS-Net), uses a Swin transformer layer as the fundamental block and the Unet architecture as the neural network backbone. We employ a hierarchical transformer with shifted windows to extract multiscale contextual features during encoding. A new dual upsampling technique enhances the spatial resolution of the feature maps in the Swin transformer-based decoder layer. A raw magnetic resonance imaging dataset is used for network training and testing; the data contain various motion artifacts with ground truth images of the same subjects. The results are compared with those of six state-of-the-art MRI motion correction methods for two types of motion. When motions were brief (within 5 s), the method reduced the average normalized root mean square error (NRMSE) from 45.25% to 17.51%, increased the mean structural similarity index measure (SSIM) from 79.43% to 91.72%, and increased the peak signal-to-noise ratio (PSNR) from 18.24 to 26.57 dB. Similarly, when motions were longer (5 to 10 s), our approach decreased the average NRMSE from 60.30% to 21.04%, improved the mean SSIM from 33.86% to 90.33%, and increased the PSNR from 15.64 to 24.99 dB. The anatomical structures of the corrected images were similar to those of the motion-free brain data.
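As an illustration of two of the reported metrics, the following minimal sketch computes NRMSE and PSNR between a motion-free reference and a corrected image. The abstract does not state the conventions used, so two common ones are assumed here: NRMSE is normalized by the Euclidean norm of the reference, and PSNR takes the maximum reference intensity as the peak. The function names `nrmse` and `psnr` are hypothetical, and images are passed as flattened sequences of pixel intensities.

```python
import math


def nrmse(reference, corrected):
    """Root mean square error normalized by the Euclidean norm of the reference.

    Assumption: this is one common NRMSE convention; others normalize by the
    intensity range or the mean of the reference instead.
    """
    if len(reference) != len(corrected):
        raise ValueError("images must have the same number of pixels")
    squared_error = sum((r - c) ** 2 for r, c in zip(reference, corrected))
    reference_energy = sum(r ** 2 for r in reference)
    return math.sqrt(squared_error / reference_energy)


def psnr(reference, corrected, peak=None):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak defaults to the maximum intensity of the reference.
    """
    if len(reference) != len(corrected):
        raise ValueError("images must have the same number of pixels")
    if peak is None:
        peak = max(reference)
    mse = sum((r - c) ** 2 for r, c in zip(reference, corrected)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Toy example: a four-pixel "image" and a slightly perturbed copy.
reference = [0.0, 0.5, 1.0, 0.5]
corrected = [0.1, 0.5, 0.9, 0.5]
print(f"NRMSE: {100 * nrmse(reference, corrected):.2f}%")
print(f"PSNR:  {psnr(reference, corrected):.2f} dB")
```

Lower NRMSE and higher PSNR both indicate that the corrected image is closer to the motion-free reference, which is the direction of improvement reported above.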