DiversityScanner: Robotic handling of small invertebrates with machine learning methods

Mol Ecol Resour. 2022 May;22(4):1626-1638. doi: 10.1111/1755-0998.13567. Epub 2021 Dec 23.

Abstract

Invertebrate biodiversity remains poorly understood although it comprises much of the terrestrial animal biomass and most species, and supplies many ecosystem services. The main obstacle is the specimen-rich samples obtained with quantitative sampling techniques (e.g., Malaise trapping). Traditional sorting requires manual handling, while molecular techniques based on metabarcoding lose the association between individual specimens and sequences and thus struggle to obtain precise abundance information. Here we present a sorting robot that prepares specimens from bulk samples for barcoding. It detects, images and measures individual specimens from a sample and then moves them into the wells of a 96-well microplate. We show that the images can be used to train convolutional neural networks (CNNs) that are capable of assigning the specimens to 14 insect taxa (usually families) that are particularly common in Malaise trap samples. The average assignment precision for all taxa is 91.4% (75%-100%). This ability to identify common taxa allows for taxon-specific subsampling, because the robot can be instructed to pick only a prespecified number of specimens for abundant taxa. To obtain biomass information, the images are also used to measure specimen length and estimate body volume. We outline how the DiversityScanner can be a key component for tackling and monitoring invertebrate diversity by combining molecular and morphological tools: the images generated by the robot become training images for machine learning once they are labelled with taxonomic information from DNA barcodes. We suggest that a combination of automation, machine learning and DNA barcoding has the potential to tackle invertebrate diversity at an unprecedented scale.
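
The taxon-specific subsampling described in the abstract can be illustrated with a minimal Python sketch. This is not the DiversityScanner software. The function name `subsample_by_taxon`, the specimen IDs, the taxon labels and the cap values are all hypothetical, and the sketch assumes only that each detected specimen arrives with a predicted taxon label from the classifier.

```python
from collections import Counter
from typing import Iterable, Iterator, Mapping, Tuple


def subsample_by_taxon(
    detections: Iterable[Tuple[str, str]],
    caps: Mapping[str, int],
) -> Iterator[str]:
    """Yield the IDs of specimens to pick, skipping taxa whose cap is reached.

    detections: (specimen_id, predicted_taxon) pairs in detection order.
    caps: hypothetical per-taxon limits; taxa not listed are always picked.
    """
    picked = Counter()
    for specimen_id, taxon in detections:
        cap = caps.get(taxon)
        if cap is not None and picked[taxon] >= cap:
            continue  # enough specimens of this abundant taxon already picked
        picked[taxon] += 1
        yield specimen_id


# Illustrative input only: the labels and the cap are placeholders.
detections = [
    ("s1", "Chironomidae"),
    ("s2", "Chironomidae"),
    ("s3", "Phoridae"),
    ("s4", "Chironomidae"),
    ("s5", "Phoridae"),
]
picked = list(subsample_by_taxon(detections, {"Chironomidae": 2}))
print(picked)  # ['s1', 's2', 's3', 's5']
```

Capping only the listed taxa keeps every specimen of the rarer groups while limiting the sequencing effort spent on abundant ones, which is the trade-off the abstract attributes to taxon-specific subsampling.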

Keywords: DNA barcoding; automation; biodiversity; biomass; convolutional neural network; “dark taxa”.

MeSH terms

  • Animals
  • Biodiversity
  • DNA Barcoding, Taxonomic / methods
  • Ecosystem
  • Humans
  • Invertebrates / genetics
  • Machine Learning
  • Robotic Surgical Procedures*
  • Robotics*