Deep Video Dehazing with Semantic Segmentation

IEEE Trans Image Process. 2018 Oct 15. doi: 10.1109/TIP.2018.2876178. Online ahead of print.

Abstract

Recent research has shown the potential of using convolutional neural networks (CNNs) to accomplish single-image dehazing. In this work, we take one step further and explore the possibility of exploiting a network to perform haze removal for videos. Unlike single-image dehazing, video-based approaches can take advantage of the abundant information that exists across neighboring frames. Assuming that a scene point yields highly correlated transmission values between adjacent video frames, we develop a deep learning solution for video dehazing, in which a CNN is trained end-to-end to learn how to accumulate information across frames for transmission estimation. The estimated transmission map is subsequently used to recover a haze-free frame via the atmospheric scattering model. In addition, as the semantic information of a scene provides a strong prior for image restoration, we propose to incorporate global semantic priors as input to regularize the transmission maps, so that the estimated maps are smooth within regions belonging to the same object and discontinuous only across object boundaries. To train this network, we generate a supervised dataset consisting of synthetic hazy and haze-free videos based on the NYU depth dataset. We show that the features learned from this dataset are capable of removing haze that arises in outdoor scenes across a wide range of videos. Extensive experiments demonstrate that the proposed algorithm performs favorably against state-of-the-art methods on both synthetic and real-world videos.
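
The recovery step referenced above inverts the standard atmospheric scattering model I(x) = J(x)t(x) + A(1 - t(x)), where I is the observed hazy frame, J the scene radiance, t the transmission, and A the global atmospheric light. Below is a minimal sketch of that inversion, assuming the transmission map has already been predicted by the network and the atmospheric light estimated separately; the function and parameter names are illustrative, not taken from the paper.

```python
import numpy as np

def recover_frame(hazy, transmission, atmospheric_light, t_min=0.1):
    """Invert I = J*t + A*(1 - t) to recover the haze-free frame J.

    hazy:              H x W x 3 float array in [0, 1], the hazy frame I
    transmission:      H x W float array in (0, 1], the estimated map t
    atmospheric_light: length-3 float array, the global light A
    t_min:             lower bound on t; clamping avoids amplifying
                       sensor noise in regions of dense haze
    """
    t = np.clip(transmission, t_min, 1.0)[..., None]  # H x W x 1, broadcasts over RGB
    A = np.asarray(atmospheric_light, dtype=hazy.dtype).reshape(1, 1, 3)
    recovered = (hazy - A) / t + A  # J = (I - A) / t + A
    return np.clip(recovered, 0.0, 1.0)
```

The clamp on t is a common practical safeguard rather than part of the model itself: as t approaches zero, the division magnifies estimation errors, so a small floor keeps the recovered frame stable.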