Wild Terrestrial Animal Re-Identification Based on an Improved Locally Aware Transformer with a Cross-Attention Mechanism

Animals (Basel). 2022 Dec 12;12(24):3503. doi: 10.3390/ani12243503.

Abstract

The wildlife re-identification recognition methods based on the camera trap were used to identify different individuals of the same species using the fur, stripes, facial features and other features of the animal body surfaces in the images, which is an important way to count the individual number of a species. Re-identification of wild animals can provide solid technical support for the in-depth study of the number of individuals and living conditions of rare wild animals, as well as provide accurate and timely data support for population ecology and conservation biology research. However, due to the difficulty of recording the shy wild animals and distinguishing the similar fur of different individuals, only a few papers have focused on the re-identification recognition of wild animals. In order to fill this gap, we improved the locally aware transformer (LA transformer) network structure for the re-identification recognition of wild terrestrial animals. First of all, at the stage of feature extraction, we replaced the self-attention module of the LA transformer with a cross-attention block (CAB) in order to calculate the inner-patch attention and cross-patch attention, so that we could efficiently capture the global information of the animal body's surface and local feature differences of fur, colors, textures, or faces. Then, the locally aware network of the LA transformer was used to fuse the local and global features. Finally, the classification layer of the network realized wildlife individual recognition. In order to evaluate the performance of the model, we tested it on a dataset of Amur tiger torsos and the face datasets of six different species, including lions, golden monkeys, meerkats, red pandas, tigers, and chimpanzees. The experimental results showed that our wildlife re-identification model has good generalization ability and is superior to the existing methods in mAP (mean average precision), and obtained comparable results in the metrics Rank 1 and Rank 5.

Keywords: cross-attention mechanism; local feature differences; locally aware transformer; wild terrestrial animal re-identification.

Grants and funding

This research received no external funding.