Tongue gestural target classification is of great interest to researchers in the speech production field. Recently, deep convolutional neural networks (CNN) have shown superiority to standard feature extraction techniques in a variety of domains. In this letter, both CNN-based speaker-dependent and speaker-independent tongue gestural target classification experiments are conducted to classify tongue gestures during natural speech production. The CNN-based method achieves state-of-the-art performance, even though no pre-training of the CNN (with the exception of a data augmentation preprocessing) was carried out.