Applications of deep convolutional neural networks to digitized natural history collections

Biodivers Data J. 2017 Nov 2;5:e21139. doi: 10.3897/BDJ.5.e21139. eCollection 2017.

Abstract

Natural history collections contain data that are critical for many scientific endeavors. Recent efforts in mass digitization are generating large datasets from these collections that can provide unprecedented insight. Here, we present examples of how deep convolutional neural networks can be applied in analyses of imaged herbarium specimens. We first demonstrate that a convolutional neural network can detect mercury-stained specimens across a collection with 90% accuracy. We then show that such a network can correctly distinguish two morphologically similar plant families 96% of the time. Discarding the most challenging specimen images increases accuracy to 94% and 99%, respectively. These results highlight the importance of mass digitization and deep learning approaches and reveal how they can together deliver powerful new investigative tools.
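The accuracy gain from discarding the most challenging images can be illustrated with a small sketch. The abstract does not say how "most challenging" was determined, so the sketch assumes one common approach: rank predictions by the classifier's confidence (for example, the highest predicted class probability) and drop the least confident fraction before computing accuracy. The function name `accuracy_after_discarding`, its parameters, and the toy values are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def accuracy_after_discarding(
    confidences: Sequence[float],
    correct: Sequence[bool],
    discard_fraction: float,
) -> Tuple[float, int]:
    """Return (accuracy, n_kept) after dropping the least confident predictions.

    Assumption: "challenging" images are those with the lowest classifier
    confidence; the study itself may have used a different criterion.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not 0.0 <= discard_fraction < 1.0:
        raise ValueError("discard_fraction must be in [0, 1)")
    n_total = len(confidences)
    if n_total == 0:
        raise ValueError("no predictions given")

    # Rank predictions from most to least confident.
    order = sorted(range(n_total), key=lambda i: confidences[i], reverse=True)

    # Keep the most confident predictions; always keep at least one.
    n_kept = max(1, n_total - int(n_total * discard_fraction))
    kept = order[:n_kept]

    accuracy = sum(1 for i in kept if correct[i]) / n_kept
    return accuracy, n_kept


# Toy, made-up values purely for illustration (not data from the study).
toy_confidences = [0.99, 0.95, 0.90, 0.80, 0.70, 0.65, 0.60, 0.55, 0.52, 0.51]
toy_correct = [True, True, True, True, True, True, True, False, True, False]

print(accuracy_after_discarding(toy_confidences, toy_correct, 0.0))
print(accuracy_after_discarding(toy_confidences, toy_correct, 0.2))
```

On the toy values, accuracy rises once the two least confident predictions are dropped, because both errors sit at the low-confidence end. The trade-off is coverage: discarded specimens receive no automated label and would need manual review.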

Keywords: convolutional neural networks; deep learning; machine learning; mass digitization; natural history collections.