A Framework for Sensorimotor Cross-Perception and Cross-Behavior Knowledge Transfer for Object Categorization

Front Robot AI. 2020 Oct 9:7:522141. doi: 10.3389/frobt.2020.522141. eCollection 2020.

Abstract

From an early age, humans learn to develop an intuition for the physical nature of the objects around them by using exploratory behaviors. Such exploration provides observations of how objects feel, sound, look, and move as a result of actions applied to them. Previous work in robotics has shown that robots can also use such behaviors (e.g., lifting, pressing, shaking) to infer object properties that camera input alone cannot detect. Such learned representations are specific to each individual robot and cannot currently be transferred directly to another robot with different sensors and actions. Moreover, sensor failure can cause a robot to lose a specific sensory modality, which may prevent it from using perceptual models that require that modality as input. To address these limitations, we propose a framework for knowledge transfer across behaviors and sensory modalities such that: (1) knowledge can be transferred from one or more robots to another, and (2) knowledge can be transferred from one or more sensory modalities to another. We propose two different models for transfer based on variational auto-encoders and encoder-decoder networks. The main hypothesis behind our approach is that if two or more robots share multi-sensory observations of a common set of objects, then those observations can be used to establish mappings between multiple feature spaces, each corresponding to a combination of an exploratory behavior and a sensory modality. We evaluate our approach on a category recognition task using a dataset in which a robot used 9 behaviors, coupled with 4 sensory modalities, performed multiple times on 100 objects. The results indicate that sensorimotor knowledge about objects can be transferred both across behaviors and across sensory modalities, such that a new robot (or the same robot, but with a different set of sensors) can bootstrap its category recognition models without having to exhaustively explore the full set of objects.

Keywords: category learning and recognition; development of representations; grounding of knowledge; haptic and tactile perception; multimodal perception and integration.
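
The transfer idea stated in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a linear least-squares map stands in for the encoder-decoder and variational auto-encoder models, a nearest-centroid rule stands in for the category recognition model, and all data are synthetic. The names `mix_src`, `mix_tgt`, `observe`, and the shared / source-only / test split are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 4 categories, 12 objects each,
# described by 3 latent physical properties (not the paper's dataset).
centers = 4.0 * np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
labels = np.repeat(np.arange(4), 12)
latent = centers[labels] + rng.normal(scale=0.2, size=(labels.size, 3))

# Each behavior-modality context sees the latent properties through its own
# (assumed linear) sensing process, giving feature spaces of different sizes.
mix_src = rng.normal(size=(3, 8))  # source context: 8-D features
mix_tgt = rng.normal(size=(3, 6))  # target context: 6-D features


def observe(z, mixing, noise=0.05):
    """Simulate noisy sensory features for objects with latent properties z."""
    return z @ mixing + rng.normal(scale=noise, size=(z.shape[0], mixing.shape[1]))


# Per category: 4 objects explored in both contexts, 4 explored only in the
# source context, and 4 held out for testing in the target context.
slot = np.tile(np.arange(12), 4)
shared = slot < 4
source_only = (slot >= 4) & (slot < 8)
test = slot >= 8

# 1) Fit a source-to-target mapping on the shared objects
#    (least-squares stand-in for the paper's neural models).
W, *_ = np.linalg.lstsq(
    observe(latent[shared], mix_src),
    observe(latent[shared], mix_tgt),
    rcond=None,
)

# 2) Project source features of source-only objects into the target space.
projected = observe(latent[source_only], mix_src) @ W

# 3) Train a nearest-centroid recognizer on the projected features only.
centroids = np.stack(
    [projected[labels[source_only] == c].mean(axis=0) for c in range(4)]
)

# 4) Evaluate on real target-context observations of held-out objects.
real_tgt = observe(latent[test], mix_tgt)
dists = np.linalg.norm(real_tgt[:, None, :] - centroids[None, :, :], axis=2)
predicted = dists.argmin(axis=1)
accuracy = float((predicted == labels[test]).mean())
print(f"accuracy on real target features: {accuracy:.2f}")
```

In this toy setting the recognizer never sees real target-context features during training, only features generated through the learned mapping, which mirrors the bootstrapping scenario the abstract describes. A linear map is enough here only because the synthetic sensing process is linear by construction; real sensorimotor features would likely call for the nonlinear models named in the abstract.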