Processing citizen science- and machine-annotated time-lapse imagery for biologically meaningful metrics

Sci Data. 2020 Mar 26;7(1):102. doi: 10.1038/s41597-020-0442-6.

Abstract

Time-lapse cameras facilitate remote and high-resolution monitoring of wild animal and plant communities, but the image data produced require further processing to be useful. Here we publish pipelines to process raw time-lapse imagery, resulting in count data (number of penguins per image) and 'nearest neighbour distance' measurements. The latter provide useful summaries of colony spatial structure (which can indicate phenological stage) and can be used to detect movement; these metrics could be valuable for a number of different monitoring scenarios, including image capture during aerial surveys. We present two alternative pathways for producing counts: (1) via the Zooniverse citizen science project Penguin Watch and (2) via a computer vision algorithm (Pengbot), and share a comparison of citizen science-, machine learning-, and expert-derived counts. We provide example files for 14 Penguin Watch cameras, generated from 63,070 raw images annotated by 50,445 volunteers. We encourage the use of this large open-source dataset, and the associated processing methodologies, for both ecological studies and continued machine learning and computer vision development.
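
To illustrate the 'nearest neighbour distance' metric described in the abstract, the sketch below computes a per-image count and each individual's distance to its closest neighbour from (x, y) annotation coordinates. This is a minimal Python example, not the published pipeline: the input structure, column names, and example values are assumptions made for demonstration only.

```python
# Minimal sketch (not the published pipeline): given per-image penguin
# positions as (x, y) pixel coordinates, derive a per-image count and
# nearest-neighbour distances. Input structure is an illustrative assumption.
import numpy as np
import pandas as pd

def nearest_neighbour_distances(points: np.ndarray) -> np.ndarray:
    """Return each point's distance to its closest other point (in pixels)."""
    if len(points) < 2:
        return np.array([])  # NND is undefined with fewer than two individuals
    diffs = points[:, None, :] - points[None, :, :]   # pairwise offsets (n, n, 2)
    dists = np.sqrt((diffs ** 2).sum(axis=-1))        # pairwise Euclidean distances
    np.fill_diagonal(dists, np.inf)                   # exclude self-distances
    return dists.min(axis=1)

# Hypothetical consensus annotations for two images.
annotations = pd.DataFrame({
    "image": ["img_0001", "img_0001", "img_0001", "img_0002", "img_0002"],
    "x": [120.0, 134.5, 410.2, 88.0, 95.5],
    "y": [300.0, 310.8, 295.1, 150.2, 160.0],
})

for image, group in annotations.groupby("image"):
    pts = group[["x", "y"]].to_numpy()
    nnd = nearest_neighbour_distances(pts)
    print(image, "count:", len(pts),
          "median NND (px):", round(float(np.median(nnd)), 1))
```

Summaries such as the median nearest-neighbour distance per image give a simple measure of colony spatial structure that can be tracked through time, in the spirit of the metrics described above.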

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Citizen Science*
  • Image Processing, Computer-Assisted*
  • Machine Learning*
  • Spheniscidae
  • Time-Lapse Imaging*