An annotated fluorescence image dataset for training nuclear segmentation methods

Sci Data. 2020 Aug 11;7(1):262. doi: 10.1038/s41597-020-00608-w.

Abstract

Fully automated nuclear image segmentation is a prerequisite for statistically significant, quantitative analyses of tissue preparations, as applied in digital pathology or quantitative microscopy. The design of segmentation methods that work independently of the tissue type or preparation is complex, due to variations in nuclear morphology, staining intensity, cell density and nuclei aggregations. Machine learning-based segmentation methods can overcome these challenges; however, high-quality expert-annotated images are required for training. Currently, the few annotated fluorescence image datasets that are publicly available do not cover a broad range of tissues and preparations. We present a comprehensive, annotated dataset including tightly aggregated nuclei of multiple tissues for the training of machine learning-based nuclear segmentation algorithms. The proposed dataset covers sample preparation methods frequently used in quantitative immunofluorescence microscopy. We demonstrate the heterogeneity of the dataset with respect to multiple parameters such as magnification, modality, signal-to-noise ratio and diagnosis. Based on a suggested split into training and test sets and additional single-nuclei expert annotations, machine learning-based image segmentation methods can be trained and evaluated.

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Fluorescence*
  • Humans
  • Image Processing, Computer-Assisted*
  • Machine Learning*
  • Microscopy, Fluorescence*