Improving deep learning-based segmentation of diatoms in gigapixel-sized virtual slides by object-based tile positioning and object integrity constraint

PLoS One. 2023 Feb 24;18(2):e0272103. doi: 10.1371/journal.pone.0272103. eCollection 2023.

Abstract

Diatoms represent one of the morphologically and taxonomically most diverse groups of microscopic eukaryotes. Light microscopy-based taxonomic identification and enumeration of frustules, the silica shells of these microalgae, is broadly used in aquatic ecology and biomonitoring. One key step in emerging digital variants of such investigations is segmentation, a task that has been addressed before, but usually in manually captured megapixel-sized images of individual diatom cells with a mostly clean background. In this paper, we applied deep learning-based segmentation methods to gigapixel-sized, high-resolution scans of diatom slides with a realistically cluttered background. This setup requires large slide scans to be subdivided into small images (tiles) to apply a segmentation model to them. This subdivision (tiling), when done using a sliding window approach, often leads to cropping relevant objects at the boundaries of individual tiles. We hypothesized that in the case of diatom analysis, reducing the amount of such cropped objects in the training data can improve segmentation performance by allowing for a better discrimination of relevant, intact frustules or valves from small diatom fragments, which are considered irrelevant when counting diatoms. We tested this hypothesis by comparing a standard sliding window / fixed-stride tiling approach with two new approaches we term object-based tile positioning with and without object integrity constraint. With all three tiling approaches, we trained Mask-R-CNN and U-Net models with different amounts of training data and compared their performance. Object-based tiling with object integrity constraint led to an improvement in pixel-based precision by 12-17 percentage points without substantially impairing recall when compared with standard sliding window tiling. We thus propose that training segmentation models with object-based tiling schemes can improve diatom segmentation from large gigapixel-sized images but could potentially also be relevant for other image domains.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Deep Learning*
  • Diatoms*
  • Image Processing, Computer-Assisted / methods
  • Microscopy

Grants and funding

This work was funded by the Deutsche Forschungsgemeinschaft (DFG) in the framework of the priority programme SPP 1991 Taxon-OMICS under grant nrs. BE4316/7-1 & NA 731/9-1. D.L.'s contribution was supported by the German Federal Ministry for Economic Affairs and Energy (BMWi) (FKZ: 0324254D). BIIGLE is supported by the BMBF-funded de.NBI Cloud within the German Network for Bioinformatics Infrastructure (de.NBI) (031A537B, 031A533A, 031A538A, 031A533B, 031A535A, 031A537C, 031A534A, 031A532B). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.