Memory-efficient 2.5D convolutional transformer networks for multi-modal deformable registration with weak label supervision applied to whole-heart CT and MRI scans

Alessa Hering; Sven Kuckertz; Stefan Heldmann; Mattias P Heinrich

doi:10.1007/s11548-019-02068-z

Int J Comput Assist Radiol Surg. 2019 Nov;14(11):1901-1912. doi: 10.1007/s11548-019-02068-z. Epub 2019 Sep 19.

Authors

Alessa Hering¹ ², Sven Kuckertz³, Stefan Heldmann³, Mattias P Heinrich⁴

Affiliations

¹ Fraunhofer Institute for Digital Medicine MEVIS, Maria-Goeppert-Str. 3, 23562, Lübeck, Germany. alessa.hering@mevis.fraunhofer.de.
² Diagnostic Image Analysis Group, Radboudumc, Geert Grooteplein 10, 6525 GA, Nijmegen, Netherlands. alessa.hering@mevis.fraunhofer.de.
³ Fraunhofer Institute for Digital Medicine MEVIS, Maria-Goeppert-Str. 3, 23562, Lübeck, Germany.
⁴ Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562, Lübeck, Germany.

PMID: 31538274
DOI: 10.1007/s11548-019-02068-z

Abstract

PURPOSE: Despite their potential for improvement through supervision, deep learning-based registration approaches are difficult to train for large deformations in 3D scans because of excessive memory requirements.

METHODS: We propose a new 2.5D convolutional transformer architecture that enables us to learn a memory-efficient, weakly supervised deep learning model for multi-modal image registration. Furthermore, we are the first to integrate a volume change control term into the loss function of a deep learning-based registration method in order to penalize foldings in the deformation field.

RESULTS: Our approach succeeds at learning large deformations across multi-modal images. We evaluate it on 100 pair-wise registrations of CT and MRI whole-heart scans and demonstrate considerably higher Dice scores (0.74) than a state-of-the-art unsupervised discrete registration framework (deeds, Dice of 0.71).

CONCLUSION: Our proposed memory-efficient registration method performs better than state-of-the-art conventional registration methods. Using a volume change control term in the loss function considerably reduces the number of foldings on new registration cases.
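
The two quantities reported in the abstract, Dice overlap and the number of foldings, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 2D displacement field on a unit-spaced grid, forward differences, and the common convention that a folding is a grid cell where the Jacobian determinant of the transformation x + u(x) is not positive. The function names `dice` and `count_foldings_2d` are hypothetical.

```python
def dice(a, b, label):
    """Dice overlap of one label between two equally long label sequences."""
    in_a = [x == label for x in a]
    in_b = [y == label for y in b]
    inter = sum(1 for p, q in zip(in_a, in_b) if p and q)
    total = sum(in_a) + sum(in_b)
    # Convention (assumption): two empty segmentations count as perfect overlap.
    return 2.0 * inter / total if total else 1.0


def count_foldings_2d(ux, uy):
    """Count grid cells where det(J) <= 0 for phi(x) = x + u(x).

    ux, uy: 2D lists (rows = y, columns = x) holding the x and y
    components of the displacement u in pixel units.
    Derivatives are approximated with forward differences.
    """
    rows, cols = len(ux), len(ux[0])
    folds = 0
    for i in range(rows - 1):
        for j in range(cols - 1):
            dux_dx = ux[i][j + 1] - ux[i][j]
            dux_dy = ux[i + 1][j] - ux[i][j]
            duy_dx = uy[i][j + 1] - uy[i][j]
            duy_dy = uy[i + 1][j] - uy[i][j]
            # Jacobian of phi is the identity plus the gradient of u.
            det = (1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx
            if det <= 0:
                folds += 1
    return folds


# Toy data (made up for illustration, not from the paper).
fixed_labels = [0, 1, 1, 0]
warped_labels = [0, 1, 0, 0]
overlap = dice(fixed_labels, warped_labels, label=1)

zero = [[0.0] * 3 for _ in range(3)]
mirror_x = [[-2.0 * j for j in range(3)] for _ in range(3)]  # phi_x = -x
folds_identity = count_foldings_2d(zero, zero)
folds_mirror = count_foldings_2d(mirror_x, zero)
```

In the toy data, the zero displacement field produces no foldings, while the mirrored field flips orientation in every cell. A loss term of the kind described in the abstract would penalize such cells during training rather than only counting them afterwards.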

Keywords: 2.5D; CT; Convolutional neural networks; MRI; Multi-modal registration; Weakly supervised learning.

MeSH terms

Deep Learning*
Equipment Design
Heart / diagnostic imaging*
Humans
Magnetic Resonance Imaging / instrumentation*
Neural Networks, Computer*
Phantoms, Imaging*
Tomography, X-Ray Computed / instrumentation*

Grants and funding

320997906/Deutsche Forschungsgemeinschaft