Study on Representation Invariances of CNNs and Human Visual Information Processing Based on Data Augmentation

Yibo Cui; Chi Zhang; Kai Qiao; Linyuan Wang; Bin Yan; Li Tong

doi:10.3390/brainsci10090602

Brain Sci. 2020 Sep 2;10(9):602. doi: 10.3390/brainsci10090602.

Authors

Yibo Cui¹, Chi Zhang¹, Kai Qiao¹, Linyuan Wang¹, Bin Yan¹, Li Tong¹

Affiliation

¹ Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China.

Abstract

Representation invariance plays a significant role in the performance of deep convolutional neural networks (CNNs) and of human visual information processing in a variety of complex image-based tasks. However, considerable confusion remains about the representation invariance mechanisms of these two sophisticated systems. To investigate their relationship under common conditions, we proposed a representation invariance analysis approach based on data augmentation. First, the original image library was expanded by data augmentation. The representation invariances of CNNs and the ventral visual stream were then studied by comparing, before and after data augmentation, the similarities of the corresponding layer features of CNNs and the prediction performance of visual encoding models based on functional magnetic resonance imaging (fMRI). Our experimental results suggest that the architecture of CNNs, that is, the combination of convolutional and fully connected layers, gives rise to their representation invariance. Remarkably, we found that representation invariance is present at all successive stages of the ventral visual stream. These findings reveal an internal correlation between CNNs and the human visual system in representation invariance. Our study advances invariant representation in computer vision and deepens the understanding of the representation invariance mechanism of human visual information processing.
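The feature-comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the networks, the augmentations, or the similarity measure used, so the toy single-layer feature extractor (`toy_layer_features`), the horizontal flip, and the Pearson correlation below are all assumptions chosen only to show the shape of the analysis.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Naive 'valid' 2-D cross-correlation of a single-channel image."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def toy_layer_features(image, kernels, pool):
    """Hypothetical stand-in for one CNN layer: convolution, ReLU, and
    optional global average pooling. Not the network used in the paper."""
    maps = np.stack([np.maximum(conv2d_valid(image, k), 0.0) for k in kernels])
    if pool:
        return maps.mean(axis=(1, 2))  # one value per filter
    return maps.ravel()  # keep full spatial layout


def feature_similarity(a, b):
    """Pearson correlation between two feature vectors (assumed metric)."""
    return float(np.corrcoef(a, b)[0, 1])


rng = np.random.default_rng(0)
image = rng.random((32, 32))              # placeholder for a stimulus image
kernels = rng.standard_normal((8, 3, 3))  # random, untrained filters
augmented = np.fliplr(image)              # one example augmentation

# Similarity of "layer" features before and after augmentation.
sim_spatial = feature_similarity(
    toy_layer_features(image, kernels, pool=False),
    toy_layer_features(augmented, kernels, pool=False),
)
sim_pooled = feature_similarity(
    toy_layer_features(image, kernels, pool=True),
    toy_layer_features(augmented, kernels, pool=True),
)
print("spatial features:", sim_spatial)
print("pooled features: ", sim_pooled)
```

In a real analysis, the toy extractor would be replaced by activations from each layer of a trained CNN, and the comparison would be repeated across layers and augmentation types to see where similarity stays high.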

Keywords: CNNs; data augmentation; fMRI visual encoding model; human visual information processing; representation invariance.

Grants and funding

No. 2017YFB1002502/National Basic Research Program of China (973 Program)