Deep Instance Segmentation and Visual Servoing to Play Jenga with a Cost-Effective Robotic System

Luca Marchionna; Giulio Pugliese; Mauro Martini; Simone Angarano; Francesco Salvetti; Marcello Chiaberge

doi:10.3390/s23020752

Sensors (Basel). 2023 Jan 9;23(2):752. doi: 10.3390/s23020752.

Authors

Luca Marchionna¹, Giulio Pugliese¹, Mauro Martini¹,², Simone Angarano¹,², Francesco Salvetti¹,², Marcello Chiaberge¹,²

Affiliations

¹ Department of Electronics and Telecommunications (DET), Politecnico di Torino, 10129 Torino, Italy.
² PIC4SeR Interdepartmental Centre for Service Robotics, 10129 Torino, Italy.

Abstract

The game of Jenga is a benchmark used for developing innovative manipulation solutions for complex tasks. Indeed, it encourages the study of novel robotics methods to successfully extract blocks from a tower. A Jenga game involves many traits of complex industrial and surgical manipulation tasks, requiring a multi-step strategy, the combination of visual and tactile data, and the highly precise motion of a robotic arm to perform a single block extraction. In this work, we propose a novel, cost-effective architecture for playing Jenga with e.DO, a 6DOF anthropomorphic manipulator manufactured by Comau, a standard depth camera, and an inexpensive monodirectional force sensor. Our solution focuses on a visual-based control strategy to accurately align the end-effector with the desired block, enabling block extraction by pushing. To this end, we trained an instance segmentation deep learning model on a custom synthetic dataset to segment each piece of the Jenga tower, allowing for visual tracking of the desired block's pose during the motion of the manipulator. We integrated the visual-based strategy with a 1D force sensor to detect whether the block could be safely removed by identifying a force threshold value. Our experiments show that our low-cost solution allows e.DO to precisely reach removable blocks and perform up to 14 consecutive extractions.

Keywords: Jenga; deep instance segmentation; robotic arm; sensor fusion; visual servoing.

MeSH terms

Cost-Benefit Analysis
Robotics* / methods

Grants and funding

This research received no external funding.