Dosimetry-Driven Quality Measure of Brain Pseudo Computed Tomography Generated From Deep Learning for MRI-Only Radiation Therapy Treatment Planning

Int J Radiat Oncol Biol Phys. 2020 Nov 1;108(3):813-823. doi: 10.1016/j.ijrobp.2020.05.006. Epub 2020 May 14.

Abstract

Purpose: This study aims to evaluate the impact of key parameters on the quality of pseudo computed tomography (pCT) generated from magnetic resonance imaging (MRI) with a 3-dimensional (3D) convolutional neural network (CNN).

Methods and materials: Four hundred two brain tumor cases were retrieved, yielding 182 paired computed tomography (CT) and T1-weighted MRI (T1) scans, 180 paired CT and contrast-enhanced T1-weighted MRI (T1-Gd) scans, and 40 cases with CT, T1, and T1-Gd scans. A 3D CNN was used to map T1 or T1-Gd onto CT scans and to evaluate the importance of different components. First, the influence of training set size on testing set accuracy was assessed. Second, we evaluated the impact of the MRI sequence, using T1-only and T1-Gd-only cohorts. We then investigated 4 MRI standardization approaches (histogram-based, zero-mean/unit-variance, white stripe, and no standardization), as well as the influence of bias field correction, using training, validation, and testing cohorts of 242, 81, and 79 patient cases, respectively. Finally, 2 networks, namely HighResNet and 3D UNet, were compared to evaluate the impact of the architecture on pCT quality. The mean absolute error, gamma indices, and dose-volume histograms were used as evaluation metrics.
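
As an illustration of two of the steps named above, the following is a minimal sketch of zero-mean/unit-variance standardization of an MRI volume and of the mean absolute error between a CT and a pCT. It is not the authors' implementation: the function names, the optional `mask` argument, and the choice to compute statistics inside the mask are assumptions made for this example, and the histogram-based and white stripe approaches are not shown.

```python
import numpy as np


def zscore_standardize(mri, mask=None):
    """Zero-mean/unit-variance standardization of an MRI volume.

    `mask` is a hypothetical boolean array (for example a head or brain
    mask); when given, the mean and standard deviation are computed only
    from voxels inside it, but the whole volume is rescaled.
    """
    mri = np.asarray(mri, dtype=np.float64)
    voxels = mri[mask] if mask is not None else mri
    mean, std = voxels.mean(), voxels.std()
    if std == 0:
        raise ValueError("constant image: standard deviation is zero")
    return (mri - mean) / std


def mean_absolute_error_hu(ct, pct, mask=None):
    """Mean absolute error between CT and pCT, in the units of the inputs
    (Hounsfield units if both volumes are expressed in HU)."""
    diff = np.abs(np.asarray(ct, dtype=np.float64) - np.asarray(pct, dtype=np.float64))
    return float(diff[mask].mean()) if mask is not None else float(diff.mean())
```

Both functions assume that the volumes are already co-registered and resampled onto the same voxel grid.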

Results: Generating models using all the available cases for training led to higher pCT quality. The T1 and T1-Gd models had a maximum difference in gamma index means of 0.07 percentage points. The mean absolute error obtained with white stripe was 78 ± 22 Hounsfield units, which slightly outperformed histogram-based, zero-mean/unit-variance, and no standardization (P < .0001). Regarding the network architectures, 3%/3 mm gamma indices of 99.83% ± 0.19% and 99.74% ± 0.24% were obtained for HighResNet and 3D UNet, respectively.
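
To make the 3%/3 mm figures concrete, the sketch below computes a gamma pass rate on a 1-dimensional dose profile using the standard gamma definition (dose difference and distance to agreement combined in quadrature, with a point passing when gamma is at most 1). It is a simplified illustration only: the abstract does not describe how the gamma analysis was computed, and the function name, the global normalization to the maximum reference dose, the brute-force search, and the 1-dimensional setting are all assumptions of this example.

```python
import numpy as np


def gamma_pass_rate_1d(ref_dose, eval_dose, spacing_mm, dose_pct=3.0, dta_mm=3.0):
    """Percentage of reference points with gamma <= 1 on a 1D profile.

    Both profiles must be sampled on the same regular grid with the given
    spacing. The dose tolerance is global: `dose_pct` percent of the
    maximum reference dose (an assumption of this sketch).
    """
    ref = np.asarray(ref_dose, dtype=np.float64)
    ev = np.asarray(eval_dose, dtype=np.float64)
    if ref.shape != ev.shape or ref.ndim != 1:
        raise ValueError("profiles must be 1D and of equal length")

    x = np.arange(ref.size) * spacing_mm
    dose_tol = dose_pct / 100.0 * ref.max()

    # Rows index reference points, columns index evaluated points.
    dist_term = (x[:, None] - x[None, :]) / dta_mm
    dose_term = (ev[None, :] - ref[:, None]) / dose_tol

    # Gamma at each reference point is the minimum over all evaluated points.
    gamma = np.sqrt(dist_term**2 + dose_term**2).min(axis=1)
    return 100.0 * float(np.mean(gamma <= 1.0))
```

Clinical gamma analyses are typically run on 3D dose grids, often with a low-dose threshold and interpolation of the evaluated dose, none of which this sketch includes.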

Conclusions: Our best pCTs were generated using more than 200 samples in the training data set. Training with T1 only and T1-Gd only did not significantly affect performance. Regardless of the preprocessing applied, the dosimetry quality remained equivalent and relevant for potential use in clinical practice.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brain / diagnostic imaging
  • Brain Neoplasms / diagnostic imaging*
  • Brain Neoplasms / radiotherapy
  • Contrast Media
  • Deep Learning*
  • Humans
  • Magnetic Resonance Imaging / methods*
  • Magnetic Resonance Imaging / standards
  • Neural Networks, Computer
  • Radiometry
  • Radiotherapy / standards
  • Retrospective Studies
  • Skull / diagnostic imaging
  • Tomography, X-Ray Computed / methods*

Substances

  • Contrast Media