Ear Detection Using Convolutional Neural Network on Graphs with Filter Rotation

Sensors (Basel). 2019 Dec 13;19(24):5510. doi: 10.3390/s19245510.

Abstract

Geometric deep learning (GDL) generalizes convolutional neural networks (CNNs) to non-Euclidean domains. In this work, a GDL technique, allowing the application of CNN on graphs, is examined. It defines convolutional filters with the use of the Gaussian mixture model (GMM). As those filters are defined in continuous space, they can be easily rotated without the need for some additional interpolation. This, in turn, allows constructing systems having rotation equivariance property. The characteristic of the proposed approach is illustrated with the problem of ear detection, which is of great importance in biometric systems enabling image based, discrete human identification. The analyzed graphs were constructed taking into account superpixels representing image content. This kind of representation has several advantages. On the one hand, it significantly reduces the amount of processed data, allowing building simpler and more effective models. On the other hand, it seems to be closer to the conscious process of human image understanding as it does not operate on millions of pixels. The contributions of the paper lie both in GDL application area extension (semantic segmentation of the images) and in the novel concept of trained filter transformations. We show that even significantly reduced information about image content and a relatively simple, in comparison with classic CNN, model (smaller number of parameters and significantly faster processing) allows obtaining detection results on the quality level similar to those reported in the literature on the UBEAR dataset. Moreover, we show experimentally that the proposed approach possesses in fact the rotation equivariance property allowing detecting rotated structures without the need for labor consuming training on all rotated and non-rotated images.

Keywords: Gaussian mixture model; ear detection; geometric deep learning; rotation equivariance; semantic segmentation; structured prediction; superpixels.