Accurate segmentation of head and neck radiotherapy CT scans with 3D CNNs: consistency is key

Phys Med Biol. 2023 Apr 3;68(8). doi: 10.1088/1361-6560/acc309.

Abstract

Objective. Automatic segmentation of organs-at-risk in radiotherapy planning computed tomography (CT) scans using convolutional neural networks (CNNs) is an active research area. Very large datasets are usually required to train such CNN models. In radiotherapy, large, high-quality datasets are scarce, and combining data from several sources can reduce the consistency of training segmentations. It is therefore important to understand the impact of training data quality on the performance of auto-segmentation models for radiotherapy. Approach. In this study, we took an existing 3D CNN architecture for head and neck CT auto-segmentation and compared the performance of models trained with a small, well-curated dataset (n = 34) and with a far larger dataset (n = 185) containing less consistent training segmentations. We performed 5-fold cross-validation on each dataset and tested segmentation performance using the 95th percentile Hausdorff distance and mean distance-to-agreement metrics. Finally, we validated the generalisability of our models with an external cohort of patient data (n = 12) with five expert annotators. Main results. The models trained with the large dataset were greatly outperformed by models (of identical architecture) trained with a smaller but higher-consistency set of training samples. Our models trained with the small dataset produced segmentations of similar accuracy to those of expert human observers and generalised well to new data, performing within inter-observer variation. Significance. We empirically demonstrate the importance of highly consistent training samples when training a 3D auto-segmentation model for use in radiotherapy. Crucially, the consistency of the training segmentations had a greater impact on model performance than the size of the dataset used.
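
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact definitions or implementation, so everything here is an assumption: each segmentation surface is represented as a list of 3D points (in mm), the 95th percentile Hausdorff distance is taken as the larger of the two directed 95th-percentile nearest-surface distances, and the mean distance-to-agreement as the average of the two directed mean distances. Other conventions exist (for example, pooling both directions before taking the percentile). The function names are hypothetical.

```python
import math


def directed_distances(a, b):
    """For each point in surface a, the distance to the nearest point in surface b."""
    return [min(math.dist(p, q) for q in b) for p in a]


def percentile(values, pct):
    """Percentile with linear interpolation between ranks (assumed convention)."""
    s = sorted(values)
    if len(s) == 1:
        return s[0]
    rank = (len(s) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return s[lo] + (s[hi] - s[lo]) * (rank - lo)


def hd95(a, b):
    """Symmetric 95th percentile Hausdorff distance (max of the directed values)."""
    return max(
        percentile(directed_distances(a, b), 95),
        percentile(directed_distances(b, a), 95),
    )


def mean_dta(a, b):
    """Symmetric mean distance-to-agreement (average of the directed means)."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)) / 2


# Toy example: two surfaces offset by 1 mm along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(hd95(reference, predicted), mean_dta(reference, predicted))
```

The brute-force nearest-point search is quadratic in the number of surface points; it is meant only to make the definitions concrete, not to scale to full-resolution CT contours.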

Keywords: 3D auto-segmentation; convolutional neural network; effective supervised learning; medical image analysis; small dataset; training annotation consistency.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Head*
  • Humans
  • Image Processing, Computer-Assisted* / methods
  • Neck
  • Neural Networks, Computer
  • Tomography, X-Ray Computed