SuperFast: 200× Video Frame Interpolation via Event Camera

IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7764-7780. doi: 10.1109/TPAMI.2022.3224051. Epub 2023 May 5.

Abstract

Traditional frame-based video frame interpolation (VFI) methods rely on the linear motion assumption and brightness invariance assumption, which may lead to fatal errors confronting the scenarios with high-speed motions. To tackle the above challenge, inspired by the advantages of event cameras on asynchronously recording brightness changes at each pixel, we propose a Fast-Slow joint synthesis framework for event-enhanced high-speed video frame interpolation, named SuperFast, in this paper, which can generate high frame rate (5000 FPS, 200× faster) video from the input low frame rate (25 FPS) video and the corresponding event stream. In our framework, the task is divided into two sub-tasks, i.e., video frame interpolation for the contents with and without high-speed motions, which are tackled by two corresponding branches, i.e., the fast synthesis pathway and the slow synthesis pathway. The fast synthesis pathway leverages a spiking neural network to encode the input event stream, and combines boundary frames to generate intermediate results through synthesis and refinement, targeting on contents with high-speed motions. The slow synthesis pathway stacks the two input boundary frames and the event stream to synthesize intermediate results, focusing on relatively slow-motion contents. Finally, a fusion module with a comparison loss is utilized to generate the final video frame interpolation results. We also build a hybrid visual acquisition system containing an event camera and a high frame rate camera, and collect the first 5000 FPS High-Speed Event-enhanced Video frame Interpolation (THU[Formula: see text]) dataset. To evaluate the performance of our proposed framework, we have conducted experiments on our THU[Formula: see text] dataset and the existing HS-ERGB dataset. Experimental results demonstrate that our proposed framework can achieve state-of-the-art 200× video frame interpolation performance under high-speed motion scenarios.