High-speed and robust infrared-guiding multiuser eye localization system for autostereoscopic display

Appl Opt. 2020 May 10;59(14):4199-4208. doi: 10.1364/AO.386903.

Abstract

To localize viewers' eyes, we built a high-speed and robust infrared-guiding multiuser eye localization system for a binocular autostereoscopic display, which projects a pair of parallax images to the corresponding eyes. The system is composed of a low-resolution thermal infrared camera, a pair of high-resolution left and right visible spectral cameras, and an industrial computer. Two camera pairs each form a binocular vision system: the infrared camera with the left visible spectral camera, and the left visible spectral camera with the right one. The thermal infrared camera captures thermography images, and the left and right visible spectral cameras capture the left and right visible spectral images, respectively. Because the face and the background differ in temperature, facial features are prominent in thermography images. We use the YOLO-V3 neural network to detect the viewers' faces in the thermography images. Because pseudo-faces and real faces have different features in the infrared spectrum, pseudo-faces can be easily eliminated in the thermography images. The positions and sizes of the potential bounding boxes of the detected faces in the thermography images guide the industrial computer in determining the left candidate regions in the left visible spectral image. The industrial computer then determines the right candidate regions in the right visible spectral image. In the left candidate regions, the industrial computer detects the faces and localizes the eyes using the SeetaFace algorithm. Template matching is performed between the left and right candidate regions to calculate the accurate distance between the viewer and the system. The average detection time of the proposed method is about 3-8 ms. Compared with traditional methods, the localization time is reduced by 86.7%-90.1%. Further, the proposed method is hardly influenced by pseudo-faces or strong ambient light.
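
The geometric steps of the pipeline described in the abstract (thermal bounding box to visible candidate region, template matching, then distance) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the mapping between the thermal and visible images, the matching metric, or the distance formula, so everything below is an assumption: the mapping is reduced to a pure resolution scaling with a padding margin (a real system would rely on calibration between the two cameras), the matching uses a one-dimensional sum of absolute differences, and the distance follows the standard rectified-stereo relation Z = f * B / d. The names `Box`, `thermal_to_visible_roi`, `best_match_offset`, and `depth_from_disparity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels: top-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def thermal_to_visible_roi(box, thermal_size, visible_size, margin=0.25):
    """Scale a thermal face box into visible-image pixels, pad it, and clamp it.

    Assumption: both cameras see the same field of view, so the mapping is a
    pure (width, height) scaling. The margin absorbs the resulting error.
    """
    sx = visible_size[0] / thermal_size[0]
    sy = visible_size[1] / thermal_size[1]
    pad_w = box.w * sx * margin
    pad_h = box.h * sy * margin
    x0 = max(0.0, box.x * sx - pad_w)
    y0 = max(0.0, box.y * sy - pad_h)
    x1 = min(float(visible_size[0]), (box.x + box.w) * sx + pad_w)
    y1 = min(float(visible_size[1]), (box.y + box.h) * sy + pad_h)
    return Box(x0, y0, x1 - x0, y1 - y0)


def best_match_offset(template, row):
    """Return the offset in `row` where `template` fits best (minimum SAD).

    Assumption: a 1D sum-of-absolute-differences search stands in for the
    2D template matching between left and right candidate regions.
    """
    n = len(template)
    if n == 0 or n > len(row):
        raise ValueError("template must be non-empty and no longer than row")

    def sad(offset):
        return sum(abs(row[offset + i] - template[i]) for i in range(n))

    return min(range(len(row) - n + 1), key=sad)


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

In this sketch the disparity would be the horizontal position of a patch in the left image minus the position returned by `best_match_offset` for the same patch in the right image. Restricting the search to the candidate regions instead of the full frame is what keeps the matching cost small.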