Toward Joint Acquisition-Annotation of Images with Egocentric Devices for a Lower-Cost Machine Learning Application to Apple Detection

Salma Samiei; Pejman Rasti; Paul Richard; Gilles Galopin; David Rousseau

doi:10.3390/s20154173

Sensors (Basel). 2020 Jul 27;20(15):4173. doi: 10.3390/s20154173.

Authors

Salma Samiei¹,², Pejman Rasti¹,³, Paul Richard¹, Gilles Galopin², David Rousseau¹,²

Affiliations

¹ Laboratoire Angevin de Recherche en Ingénierie des Systèmes (LARIS), Université d'Angers, 62 Avenue Notre Dame du Lac, 49035 Angers, France.
² UMR 1345 Institut de Recherche en Horticulture et Semences (IRHS), INRAe, 42 Rue Georges Morel, 49071 Beaucouzé, France.
³ Department of data science, école d'ingénieur Informatique et Environnement (ESAIP), 49124 Angers, France.

Abstract

Since most computer vision approaches are now driven by machine learning, the current bottleneck is the annotation of images. This time-consuming task is usually performed manually after the images have been acquired. In this article, we assess the value of various egocentric vision approaches for performing joint acquisition and automatic image annotation, rather than the conventional two-step process of acquisition followed by manual annotation. This approach is illustrated with apple detection in challenging field conditions. We demonstrate that high performance is possible with eye-tracking systems in automatic apple segmentation (Dice 0.85), apple counting (88 percent probability of good detection, and 0.09 true-negative rate), and apple localization (a shift error of less than 3 pixels). This is obtained by simply applying the areas of interest captured by the egocentric devices to standard, unsupervised image segmentation. We especially stress the time saved by using such eye-tracking devices on head-mounted systems to jointly perform image acquisition and automatic annotation. We demonstrate a more than 10-fold time gain compared with classical image acquisition followed by manual image annotation.

Keywords: apple detection; egocentric vision; eye-tracking; image annotation.
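
The abstract reports segmentation quality as a Dice score (0.85). As a reminder of what that figure measures, the following is a minimal sketch of the standard Dice coefficient between a predicted binary mask and a reference mask. It is not the authors' evaluation code: the function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for this illustration.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks.

    Masks are given as nested lists (rows of 0/1 values) of equal shape.
    This representation is an assumption made for the sketch.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels labelled as foreground in both masks
    pred_area = 0     # foreground pixels in the predicted mask
    truth_area = 0    # foreground pixels in the reference mask
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p, t = bool(p), bool(t)
            intersection += p and t
            pred_area += p
            truth_area += t

    total = pred_area + truth_area
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 reference pixels, 2 in common.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground pixel, so the reported 0.85 indicates substantial overlap between the automatically obtained and the reference apple regions.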