Multi-scale feature aggregation and fusion network with self-supervised multi-level perceptual loss for textures preserving low-dose CT denoising

Phys Med Biol. 2024 Apr 26;69(10). doi: 10.1088/1361-6560/ad3c91.

Abstract

Objective. Textures and detailed structures in computed tomography (CT) images are highly desirable for clinical diagnosis. This study aims to expand the current body of work on texture- and detail-preserving convolutional neural networks for low-dose CT (LDCT) image denoising.

Approach. This study proposed a novel multi-scale feature aggregation and fusion network (MFAF-net) for LDCT image denoising. Specifically, we proposed a multi-scale residual feature aggregation module to characterize multi-scale structural information in CT images, which captures region-specific inter-scale variations using learned weights. We further proposed a cross-level feature fusion module to integrate cross-level features, which adaptively weights the contributions of features from encoder to decoder using a spatial pyramid attention mechanism. Moreover, we proposed a self-supervised multi-level perceptual loss module that generates multi-level auxiliary perceptual supervision for recovering salient textures and structures of tissues and lesions in CT images, taking advantage of the abundant semantic information at various levels. We introduced parameters for the perceptual loss to adaptively weight the contributions of auxiliary features at different levels, along with an automatic tuning strategy for these parameters.

Main results. Extensive experimental studies were performed to validate the effectiveness of the proposed method. The results demonstrate that the proposed method can achieve better performance in both fine-texture preservation and noise suppression for CT image denoising than other competitive convolutional neural network (CNN) based methods.

Significance. The proposed MFAF-net takes advantage of multi-scale receptive fields, cross-level feature integration and a self-supervised multi-level perceptual loss, enabling more effective recovery of fine textures and detailed structures of tissues and lesions in CT images.
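
The adaptively weighted multi-level perceptual loss described above can be illustrated with a minimal sketch. The abstract does not specify the feature extractor, the per-level distance, or how the level weights are normalized or tuned, so everything below is an assumption for illustration only: features are plain flattened lists of floats, the per-level distance is mean squared error, and the hypothetical `level_params` are turned into weights with a softmax. It is not the authors' implementation.

```python
import math


def softmax(params):
    """Map unconstrained per-level parameters to positive weights summing to 1.

    Assumption: the abstract only says the level weights are adaptive;
    softmax normalization is one common choice, not the paper's stated method.
    """
    peak = max(params)  # subtract the max for numerical stability
    exps = [math.exp(p - peak) for p in params]
    total = sum(exps)
    return [e / total for e in exps]


def mse(a, b):
    """Mean squared error between two flattened feature maps of equal length."""
    if len(a) != len(b):
        raise ValueError("feature maps must have the same number of elements")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_level_perceptual_loss(denoised_feats, reference_feats, level_params):
    """Weighted sum of per-level feature distances.

    denoised_feats / reference_feats: one flattened feature map per level
    (hypothetical stand-ins for features taken at several network depths).
    level_params: one unconstrained parameter per level (hypothetical name).
    """
    if not (len(denoised_feats) == len(reference_feats) == len(level_params)):
        raise ValueError("expected one feature map and one parameter per level")
    weights = softmax(level_params)
    per_level = [mse(d, r) for d, r in zip(denoised_feats, reference_feats)]
    return sum(w * loss for w, loss in zip(weights, per_level))


# Two levels with equal parameters, so each level gets weight 0.5.
denoised = [[0.0, 1.0], [2.0, 2.0]]
reference = [[0.0, 0.0], [2.0, 4.0]]
loss = multi_level_perceptual_loss(denoised, reference, [0.0, 0.0])
print(loss)  # 0.5 * 0.5 + 0.5 * 2.0 = 1.25
```

In a training setup the level parameters would typically be updated together with the network weights, so that levels whose features matter more for texture recovery contribute more to the total loss; how the paper's automatic tuning strategy does this is not stated in the abstract.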

Keywords: deep learning; image denoising; low-dose CT; multi-level perceptual loss; multi-scale convolutional neural networks.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Image Processing, Computer-Assisted* / methods
  • Neural Networks, Computer
  • Radiation Dosage
  • Signal-To-Noise Ratio
  • Tomography, X-Ray Computed* / methods