GSDet: Object Detection in Aerial Images Based on Scale Reasoning

IEEE Trans Image Process. 2021;30:4599-4609. doi: 10.1109/TIP.2021.3073319. Epub 2021 Apr 29.

Abstract

Variations in both object scale and style across capture scenes (e.g., downtown, port) greatly increase the difficulty of object detection in aerial images. Although ground sample distance (GSD) provides an obvious cue for addressing this issue, no existing object detection method has made use of this prior knowledge. In this paper, we propose the first object detection network to incorporate GSD into the detection modeling process. More specifically, building on a two-stage detection framework, we adopt a GSD identification subnet that converts GSD regression into a probability estimation process, and then combine the GSD information with the sizes of Regions of Interest (RoIs) to determine the physical size of objects. The estimated physical size provides a powerful prior for detection: it is used to reweight the classification-layer weights of each category to produce RoI-wise enhanced features. Furthermore, to improve discriminability among categories of similar size and to make the inference process more adaptive, scene information is also taken into account. The pipeline is flexible enough to be stacked on any modern two-stage detection framework. The improvement over existing two-stage object detection methods on the DOTA dataset demonstrates the effectiveness of our method.
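
The scale-reasoning step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the GSD probabilities are reduced to a single value or how RoI size is measured, so the sketch assumes (hypothetically) that the subnet outputs a probability per discrete GSD bin, that the GSD estimate is the expectation over bin centers, and that the RoI size is the square root of its pixel area. The function names `expected_gsd` and `roi_physical_size` are illustrative only.

```python
import math


def expected_gsd(bin_probs, bin_centers):
    """Reduce a discrete GSD distribution to one estimate (metres per pixel).

    Assumption: the GSD subnet yields one probability per GSD bin, and the
    estimate is the probability-weighted mean of the bin centers.
    """
    if len(bin_probs) != len(bin_centers):
        raise ValueError("bin_probs and bin_centers must have the same length")
    total = sum(bin_probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Normalise in case the probabilities do not sum exactly to 1.
    return sum(p * c for p, c in zip(bin_probs, bin_centers)) / total


def roi_physical_size(roi, gsd):
    """Approximate physical size (metres) of an RoI given as (x1, y1, x2, y2).

    Assumption: size is GSD times the square root of the RoI area in pixels.
    """
    x1, y1, x2, y2 = roi
    width = max(x2 - x1, 0.0)
    height = max(y2 - y1, 0.0)
    return gsd * math.sqrt(width * height)


if __name__ == "__main__":
    # Hypothetical bins centred at 0.1, 0.3 and 0.5 m/pixel.
    gsd = expected_gsd([0.25, 0.5, 0.25], [0.1, 0.3, 0.5])
    size = roi_physical_size((0.0, 0.0, 40.0, 10.0), gsd)
    print(f"GSD = {gsd:.3f} m/px, physical size = {size:.2f} m")
```

In a detector, a physical size computed this way would serve as the prior that drives the category-wise reweighting; that reweighting is learned in the paper's network and is not reproduced here.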