Learning to school in dense configurations with multi-agent deep reinforcement learning

Bioinspir Biomim. 2022 Nov 16;18(1). doi: 10.1088/1748-3190/ac9fb5.

Abstract

Fish are observed to school in different configurations. However, how and why fish maintain a stable schooling formation remains unclear. This work presents a numerical study of the dense schooling of two free swimmers using a hybrid method that couples multi-agent deep reinforcement learning with the immersed boundary-lattice Boltzmann method. Active control policies are developed by synchronously training the leader to swim at a given speed and orientation and the follower to maintain close proximity to the leader. After training, the swimmers can resist strong hydrodynamic forces to remain in stable formations while swimming along the desired path, using only tail-beat flapping. The tail movements of the swimmers in the stable formations are irregular and asymmetric, indicating that the swimmers carefully adjust their body kinematics to balance the hydrodynamic forces. In addition, a significant decrease in the mean amplitude and the cost of transport is found for the followers, indicating that these swimmers can maintain the swimming speed with less effort. The results also show that the side-by-side formation is hydrodynamically more stable but energetically less efficient than other configurations, while the full-body staggered formation is energetically more efficient as a whole.
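The paper itself provides no code; the Python sketch below only illustrates the leader/follower reward structure implied by the abstract (leader rewarded for holding a prescribed speed and orientation, follower rewarded for holding close proximity). All constants, function names (leader_reward, follower_reward, toy_step), and the kinematic stand-in for the immersed boundary-lattice Boltzmann flow solver are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Minimal sketch of the multi-agent reward structure described in the abstract.
# The toy kinematic update below stands in for the immersed boundary-lattice
# Boltzmann flow solver used in the paper; all parameter values and function
# names are illustrative assumptions.

TARGET_SPEED = 1.0      # desired leader speed (assumed units)
TARGET_HEADING = 0.0    # desired leader heading in radians (assumed)
TARGET_GAP = 0.5        # desired leader-follower spacing (assumed)


def leader_reward(speed, heading):
    """Penalize deviation from the prescribed speed and orientation."""
    return -abs(speed - TARGET_SPEED) - abs(heading - TARGET_HEADING)


def follower_reward(leader_pos, follower_pos):
    """Penalize deviation from close proximity to the leader."""
    gap = np.linalg.norm(np.asarray(leader_pos) - np.asarray(follower_pos))
    return -abs(gap - TARGET_GAP)


def toy_step(state, actions, dt=0.1):
    """Kinematic stand-in for one flow-solver step.

    `actions` are tail-beat adjustments of (leader, follower); in the actual
    study these would drive the body kinematics inside the IB-LBM simulation.
    """
    leader, follower = state
    leader = leader + dt * np.array([TARGET_SPEED + actions[0], 0.0])
    follower = follower + dt * np.array([TARGET_SPEED + actions[1], 0.0])
    return leader, follower


if __name__ == "__main__":
    # One illustrative episode with random (untrained) actions.
    rng = np.random.default_rng(0)
    leader = np.array([0.0, 0.0])
    follower = np.array([-TARGET_GAP, 0.0])
    for _ in range(10):
        actions = rng.normal(scale=0.05, size=2)  # placeholder policy outputs
        leader, follower = toy_step((leader, follower), actions)
        r_lead = leader_reward(speed=TARGET_SPEED + actions[0], heading=0.0)
        r_follow = follower_reward(leader, follower)
        print(f"leader reward {r_lead:+.3f}, follower reward {r_follow:+.3f}")
```

In the study, such per-agent rewards would be optimized synchronously by a multi-agent deep reinforcement learning algorithm while the flow solver supplies the hydrodynamic forces acting on both bodies.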

Keywords: collective motion; fish schooling; immersed boundary-lattice Boltzmann method; multi-agent deep reinforcement learning; side-by-side swimming; staggered swimming.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Biomechanical Phenomena
  • Fishes
  • Hydrodynamics
  • Reinforcement, Psychology*
  • Schools
  • Swimming*