Pupil Size Prediction Techniques Based on Convolution Neural Network

Allen Jong-Woei Whang; Yi-Yung Chen; Wei-Chieh Tseng; Chih-Hsien Tsai; Yi-Ping Chao; Chieh-Hung Yen; Chun-Hsiu Liu; Xin Zhang

doi:10.3390/s21154965

Sensors (Basel). 2021 Jul 21;21(15):4965. doi: 10.3390/s21154965.

Authors

Allen Jong-Woei Whang¹, Yi-Yung Chen², Wei-Chieh Tseng¹, Chih-Hsien Tsai³, Yi-Ping Chao⁴ ⁵ ⁶, Chieh-Hung Yen⁴ ⁷ ⁸, Chun-Hsiu Liu⁷ ⁸, Xin Zhang¹

Affiliations

¹ Department of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei City 106335, Taiwan.
² Graduate Institute of Color & Illumination Technology, National Taiwan University of Science and Technology, Taipei City 106335, Taiwan.
³ Graduate Institute of Electro-Optical Engineering, National Taiwan University of Science and Technology, Taipei City 106335, Taiwan.
⁴ Graduate Institute of Biomedical Engineering, Chang Gung University, Taoyuan City 333323, Taiwan.
⁵ Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan City 333323, Taiwan.
⁶ Department of Neurology, Chang Gung Memorial Hospital at Linkou, Taoyuan City 333423, Taiwan.
⁷ Department of Ophthalmology, Chang Gung Memorial Hospital at Linkou, Taoyuan City 333423, Taiwan.
⁸ College of Medicine, Chang Gung University, Taoyuan City 333323, Taiwan.

Abstract

Pupil size can indicate a person's physical condition and mental state. Most existing studies on AI and the pupil focus on eye tracking. This paper proposes an algorithm that calculates pupil size using a convolution neural network (CNN). The pupil is usually not perfectly round, and 50% of pupils can be calculated using an ellipse as the best-fitting shape. This paper therefore represents pupil size by the major and minor axes of an ellipse and uses these two parameters as the output of the network. The input dataset is in video format (continuous frames). Training the CNN on every frame of the videos may cause overfitting because consecutive frames are too similar. To avoid this problem, this study used data augmentation and calculated the structural similarity between images to ensure that they differed to a certain degree. To optimize the network structure, this study compared the mean error as the depth of the network and the field of view (FOV) of the convolution filter were varied. The results show that both deepening the network and widening the FOV of the convolution filter can reduce the mean error. The mean error is 5.437% for pupil length and 10.57% for pupil area. The model runs at 35 frames per second on low-cost mobile embedded systems, demonstrating that low-cost designs can be used for pupil size prediction.
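The frame-selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which structural similarity (SSIM) variant or threshold was used, so the single-window (global) SSIM, the function names `global_ssim` and `select_frames`, and the `max_ssim=0.9` cutoff are all assumptions made for illustration. Frames are represented as flat lists of grayscale values.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two flat grayscale frames.

    Assumption: standard SSIM averages over local windows; this sketch
    uses one window covering the whole frame to stay dependency-free.
    """
    if len(x) != len(y) or not x:
        raise ValueError("frames must be non-empty and the same size")
    n = len(x)
    # Stabilizing constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) * (a - mx) for a in x) / n
    vy = sum((b - my) * (b - my) for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


def select_frames(frames, max_ssim=0.9):
    """Return indices of frames that differ enough from the last kept frame.

    max_ssim is a hypothetical threshold, not a value from the paper.
    """
    kept = []
    for i, frame in enumerate(frames):
        if not kept or global_ssim(frame, frames[kept[-1]]) < max_ssim:
            kept.append(i)
    return kept


if __name__ == "__main__":
    f0 = [10, 20, 30, 40]
    f1 = [10, 20, 30, 40]  # duplicate of f0, should be dropped
    f2 = [40, 30, 20, 10]  # reversed intensities, should be kept
    print(select_frames([f0, f1, f2]))
```

Comparing each frame only against the most recently kept frame keeps the pass linear in the number of frames; a stricter filter could compare against every kept frame at higher cost.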

Keywords: biomedical imaging; computational intelligence; engineering in medicine and biology; machine learning.

MeSH terms

Algorithms*
Humans
Neural Networks, Computer*
Pupil

Grants and funding

109-2221-E-011-030-/Ministry of Science and Technology, Taiwan