Learning intraoperative organ manipulation with context-based reinforcement learning

Claudia D'Ettorre; Silvia Zirino; Neri Niccolò Dei; Agostino Stilli; Elena De Momi; Danail Stoyanov

doi:10.1007/s11548-022-02630-2

Int J Comput Assist Radiol Surg. 2022 Aug;17(8):1419-1427. doi: 10.1007/s11548-022-02630-2. Epub 2022 May 3.

Authors

Claudia D'Ettorre¹, Silvia Zirino² ³, Neri Niccolò Dei⁴, Agostino Stilli², Elena De Momi³, Danail Stoyanov²

Affiliations

¹ Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK. c.dettorre@ucl.ac.uk.
² Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK.
³ Department of Electronics, Information and Bioengineering (NearLab), Politecnico di Milano, Milan, Italy.
⁴ The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy.

Abstract

Purpose: Automation of sub-tasks during robotic surgery is challenging due to the high variability of surgical scenes within and across patients. For example, the pick and place task can be executed multiple times during the same operation and for distinct purposes. Hence, designing automation solutions that can generalise a skill over different contexts is hard. All the experiments are conducted using the Pneumatic Attachable Flexible (PAF) rail, a novel surgical tool designed for robotic-assisted intraoperative organ manipulation.

Methods: We build upon a previous open-source surgical Reinforcement Learning (RL) training environment to develop a new RL framework for manipulation skills, rlman. In rlman, contextual RL agents are trained to solve different aspects of the pick and place task using the PAF rail system. rlman supports both low- and high-dimensional state information for solving surgical sub-tasks in a simulation environment.

Results: We use rlman to train state-of-the-art RL agents to solve four different surgical sub-tasks involving manipulation skills using the PAF rail. We compare the results with state-of-the-art benchmarks found in the literature. We evaluate the agent's ability to generalise over different aspects of the targeted surgical environment.

Conclusion: We have shown that the rlman framework can support the training of different RL algorithms for solving surgical sub-tasks, and we have analysed the importance of context information for generalisation capabilities. We aim to deploy the trained policy on the real da Vinci using the dVRK and to show that the generalisation of the trained policy can be transferred to the real world.

Keywords: Computer-assisted intervention; Reinforcement learning; Robotic surgery; Surgical automation.

MeSH terms

Algorithms
Computer Simulation
Humans
Learning*
Robotic Surgical Procedures* / education
