Deformable motion compensation in interventional cone-beam CT with a context-aware learned autofocus metric

Med Phys. 2024 May 11. doi: 10.1002/mp.17125. Online ahead of print.

Abstract

Purpose: Interventional cone-beam CT (CBCT) offers 3D visualization of soft-tissue and vascular anatomy, enabling 3D guidance of abdominal interventions. However, its long acquisition time makes CBCT susceptible to patient motion. Image-based autofocus offers a suitable platform for compensation of deformable motion in CBCT, but it relies on handcrafted motion metrics that are based on first-order image properties and lack awareness of the underlying anatomy. This work proposes a data-driven approach to motion quantification via a learned, context-aware metric for deformable motion, $\mathrm{VIF}_{DL}$, that quantifies both the amount of motion degradation and the realism of the structural anatomical content in the image.

Methods: The proposed $\mathrm{VIF}_{DL}$ was modeled as a deep convolutional neural network (CNN) trained to reproduce a reference-based structural similarity metric, visual information fidelity (VIF). The deep CNN acted on motion-corrupted images, providing an estimate of the spatial VIF map that would be obtained against a motion-free reference, capturing both motion distortion and anatomic plausibility. The deep CNN featured a multi-branch architecture with a high-resolution branch for estimation of voxel-wise VIF on a small volume of interest. A second, contextual, low-resolution branch provided features associated with anatomical context for disentanglement of motion effects and anatomical appearance. The deep CNN was trained on paired motion-free and motion-corrupted data obtained with a high-fidelity forward projection model for a protocol involving 120 kV and 9.90 mGy. The performance of $\mathrm{VIF}_{DL}$ was evaluated via metrics of correlation with the ground-truth VIF and with the underlying deformable motion field, in simulated data with deformable motion fields with amplitude ranging from 5 to 20 mm and frequency from 2.4 to 4 cycles/scan. Robustness to variation in tissue contrast and noise levels was assessed in simulation studies with varying beam energy (90-120 kV) and dose (1.19-39.59 mGy). Further validation was obtained in experimental studies with a deformable phantom. Final validation was obtained by integrating $\mathrm{VIF}_{DL}$ into an autofocus compensation framework, which was applied to motion compensation on experimental datasets and evaluated via metrics of spatial resolution on soft-tissue boundaries and sharpness of contrast-enhanced vascularity.
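
To make the motion parameters concrete, the following minimal Python sketch generates a displacement value for each projection of a scan from an amplitude in mm and a frequency in cycles/scan. The sinusoidal temporal profile, the function name `motion_trajectory`, and the projection count used in the example are illustrative assumptions; the abstract specifies only the amplitude and frequency ranges, not the temporal model or the acquisition geometry.

```python
import math


def motion_trajectory(amplitude_mm, cycles_per_scan, num_projections):
    """Per-projection displacement (mm) for a hypothetical sinusoidal motion.

    Assumption: motion follows A * sin(2*pi*f*k/N), where k is the projection
    index, N the number of projections, and f the frequency in cycles/scan.
    """
    if num_projections <= 0:
        raise ValueError("num_projections must be positive")
    return [
        amplitude_mm * math.sin(2.0 * math.pi * cycles_per_scan * k / num_projections)
        for k in range(num_projections)
    ]


# Example: 10 mm amplitude, 4 cycles/scan, 400 projections (hypothetical count).
trajectory = motion_trajectory(10.0, 4, 400)
```

With 4 cycles over 400 projections, one motion period spans 100 projections, so the displacement first reaches its peak amplitude at projection index 25.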

Results: The magnitude and spatial map of $\mathrm{VIF}_{DL}$ showed consistently high correlation with the ground truth in both simulation and real data, yielding average normalized cross correlation (NCC) values of 0.95 and 0.88, respectively. Similarly, $\mathrm{VIF}_{DL}$ correlated well with the underlying motion field, with an average NCC of 0.90. In experimental phantom studies, $\mathrm{VIF}_{DL}$ properly reflected the change in motion amplitudes and frequencies: voxel-wise averaging of the local $\mathrm{VIF}_{DL}$ across the full reconstructed volume yielded an average value of 0.69 for the case with mild motion (2 mm, 12 cycles/scan) and 0.29 for the case with severe motion (12 mm, 6 cycles/scan). Autofocus motion compensation using $\mathrm{VIF}_{DL}$ resulted in noticeable mitigation of motion artifacts and improved spatial resolution of soft tissue and high-contrast structures, with reductions in edge spread function width of 8.78% and 9.20%, respectively. Motion compensation also increased the conspicuity of contrast-enhanced vascularity, reflected in an increase of 9.64% in vessel sharpness.
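
The two summary quantities reported above (NCC between a predicted and a reference map, and the voxel-wise average of a local metric map) can be sketched as follows. This is a minimal illustration that assumes the common zero-mean definition of NCC and operates on flattened lists of voxel values; the abstract does not state the exact NCC formulation used, and the function names `ncc` and `volume_average` are hypothetical.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross correlation of two equal-length sequences.

    Assumption: NCC = sum((a-mean_a)*(b-mean_b)) /
                      sqrt(sum((a-mean_a)^2) * sum((b-mean_b)^2)).
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    if denom == 0.0:
        raise ValueError("NCC is undefined for a constant input")
    return cov / denom


def volume_average(metric_map):
    """Voxel-wise average of a flattened local metric map."""
    if len(metric_map) == 0:
        raise ValueError("metric_map must be non-empty")
    return sum(metric_map) / len(metric_map)
```

Under this definition, NCC is invariant to a positive linear rescaling of either map, so it measures agreement in spatial pattern rather than in absolute value; the volume average complements it by summarizing overall magnitude.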

Conclusion: The proposed $\mathrm{VIF}_{DL}$, featuring a novel context-aware architecture, demonstrated its capacity as a reference-free surrogate of structural similarity to quantify motion-induced degradation of image quality and anatomical plausibility of image content. The validation studies showed robust performance across motion patterns, x-ray techniques, and anatomical instances. The proposed anatomy- and context-aware metric offers a powerful alternative to conventional motion estimation metrics and is a step forward for the application of deep autofocus motion compensation for guidance in clinical interventional procedures.

Keywords: deep autofocus; deformable motion; interventional CBCT.