Pix2Pix-Based Monocular Depth Estimation for Drones with Optical Flow on AirSim

Sensors (Basel). 2022 Mar 8;22(6):2097. doi: 10.3390/s22062097.

Abstract

In this work, we propose a method for estimating depth for an image of a monocular camera in order to avoid a collision for the autonomous flight of a drone. The highest flight speed of a drone is generally approximate 22.2 m/s, and long-distant depth information is crucial for autonomous flights since if the long-distance information is not available, the drone flying at high speeds is prone to collisions. However, long-range, measurable depth cameras are too heavy to be equipped on a drone. This work applies Pix2Pix, which is a kind of Conditional Generative Adversarial Nets (CGAN). Pix2Pix generates depth images from a monocular camera. Additionally, this work applies optical flow to enhance the accuracy of depth estimation. In this work, we propose a highly accurate depth estimation method that effectively embeds an optical flow map into a monocular image. The models are trained with taking advantage of AirSim, which is one of the flight simulators. AirSim can take both monocular and depth images over a hundred meter in the virtual environment, and our model generates a depth image that provides the long-distance information than images captured by a common depth camera. We evaluate accuracy and error of our proposed method using test images in AirSim. In addition, the proposed method is utilized for flight simulation to evaluate the effectiveness to collision avoidance. As a result, our proposed method is higher accuracy and lower error than a state of work. Moreover, our proposed method is lower collision than a state of work.

Keywords: AirSim; depth estimation; optical flow.

MeSH terms

  • Optic Flow*
  • Unmanned Aerial Devices