SVS-VPR: A Semantic Visual and Spatial Information-Based Hierarchical Visual Place Recognition for Autonomous Navigation in Challenging Environmental Conditions

Saba Arshad; Tae-Hyoung Park

doi:10.3390/s24030906

SVS-VPR: A Semantic Visual and Spatial Information-Based Hierarchical Visual Place Recognition for Autonomous Navigation in Challenging Environmental Conditions

Sensors (Basel). 2024 Jan 30;24(3):906. doi: 10.3390/s24030906.

Authors

Saba Arshad¹, Tae-Hyoung Park²

Affiliations

¹ Industrial Artificial Intelligence Research Center, Chungbuk National University, Cheongju 28644, Republic of Korea.
² Department of Intelligent Systems and Robotics, Chungbuk National University, Cheongju 28644, Republic of Korea.

Abstract

Robust visual place recognition (VPR) enables mobile robots to identify previously visited locations. For this purpose, the extracted visual information and place matching method plays a significant role. In this paper, we critically review the existing VPR methods and group them into three major categories based on visual information used, i.e., handcrafted features, deep features, and semantics. Focusing the benefits of convolutional neural networks (CNNs) and semantics, and limitations of existing research, we propose a robust appearance-based place recognition method, termed SVS-VPR, which is implemented as a hierarchical model consisting of two major components: global scene-based and local feature-based matching. The global scene semantics are extracted and compared with pre-visited images to filter the match candidates while reducing the search space and computational cost. The local feature-based matching involves the extraction of robust local features from CNN possessing invariant properties against environmental conditions and a place matching method utilizing semantic, visual, and spatial information. SVS-VPR is evaluated on publicly available benchmark datasets using true positive detection rate, recall at 100% precision, and area under the curve. Experimental findings demonstrate that SVS-VPR surpasses several state-of-the-art deep learning-based methods, boosting robustness against significant changes in viewpoint and appearance while maintaining efficient matching time performance.

Keywords: convolution features; neural networks; semantic segmentation; visual place recognition.

Grants and funding

IITP-2024-2020-0-01462/Innovative Human Resource Development for Local Intellectualization program through the Institute of Information & Communications Technology Planning & Evaluation(IITP)