Breaking CAPTCHA with Capsule Networks

Neural Netw. 2022 Oct:154:246-254. doi: 10.1016/j.neunet.2022.06.041. Epub 2022 Jul 8.

Abstract

Convolutional Neural Networks have achieved state-of-the-art performance in image classification. Their inability to recognise the spatial relationships between features, however, leads to misclassification of variants of the same image. Capsule Networks were introduced to address this issue by incorporating the spatial information of image features into neural networks. In this paper, we study digit recognition on CAPTCHA images, a task widely considered challenging for computers relative to human capabilities. Our intention is to provide a rigorous empirical regime in which the performance of Capsule Networks can be compared against that of Convolutional Neural Networks. Since CAPTCHA distorts images by adjusting the spatial positioning of features, it lets us demonstrate both the advantages and the limitations of the Capsule Network architecture. We train the Dynamic Routing version of Capsule Networks and the convolutional-neural-network-based deep-CAPTCHA baseline model to predict the digit sequences in numerical CAPTCHAs, investigate the performance results, and propose two improvements to the Capsule Networks model.
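
For readers unfamiliar with the Dynamic Routing variant named in the abstract, the following is a minimal pure-Python sketch of routing-by-agreement as it is commonly described for Capsule Networks. It is not taken from this paper: the function names (`squash`, `softmax`, `dynamic_routing`), the default of three routing iterations, and the nested-list layout of `u_hat` are all assumptions made for illustration, and the model evaluated by the authors may differ in its details.

```python
import math

def squash(s):
    """Shrink vector s to a length in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iterations=3):
    """Route lower-level capsule predictions to upper-level capsules.

    u_hat[i][j] is the prediction vector that lower capsule i makes
    for upper capsule j (hypothetical layout, chosen for readability).
    Returns one output vector per upper capsule.
    """
    n_in, n_out = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    # Routing logits start at zero, so coupling is uniform at first.
    b = [[0.0] * n_out for _ in range(n_in)]
    v = []
    for _ in range(num_iterations):
        # Coupling coefficients: each lower capsule distributes its vote.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v
```

In this sketch the length of each returned vector plays the role of a confidence score: upper capsules whose incoming predictions agree end up with longer outputs than those whose predictions conflict.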

Keywords: CAPTCHA; Capsule networks; Convolutional neural networks; Digit recognition; Spatial invariance.

MeSH terms

  • Humans
  • Neural Networks, Computer*
  • Recognition, Psychology*