Towards Transfer Learning Techniques-BERT, DistilBERT, BERTimbau, and DistilBERTimbau for Automatic Text Classification from Different Languages: A Case Study

Rafael Silva Barbon; Ademar Takeo Akabane

doi:10.3390/s22218184

Towards Transfer Learning Techniques-BERT, DistilBERT, BERTimbau, and DistilBERTimbau for Automatic Text Classification from Different Languages: A Case Study

Sensors (Basel). 2022 Oct 26;22(21):8184. doi: 10.3390/s22218184.

Authors

Rafael Silva Barbon¹, Ademar Takeo Akabane¹

Affiliation

¹ Postgraduate Program in Urban Infrastructure Systems and Telecommunication Networks Management, Centre for Exact Sciences, Technology and the Environment (CEATEC), Pontifical Catholic University of Campinas (PUC-Campinas), 1516 Professor Dr. Euryclides de Jesus Zerbini, Campinas 13086900, SP, Brazil.

Abstract

The Internet of Things is a paradigm that interconnects several smart devices through the internet to provide ubiquitous services to users. This paradigm and Web 2.0 platforms generate countless amounts of textual data. Thus, a significant challenge in this context is automatically performing text classification. State-of-the-art outcomes have recently been obtained by employing language models trained from scratch on corpora made up from news online to handle text classification better. A language model that we can highlight is BERT (Bidirectional Encoder Representations from Transformers) and also DistilBERT is a pre-trained smaller general-purpose language representation model. In this context, through a case study, we propose performing the text classification task with two previously mentioned models for two languages (English and Brazilian Portuguese) in different datasets. The results show that DistilBERT's training time for English and Brazilian Portuguese was about 45% faster than its larger counterpart, it was also 40% smaller, and preserves about 96% of language comprehension skills for balanced datasets.

Keywords: BERT; BERTimbau; DistilBERT; DistilBERTimbau; big data; pre-trained model; transformer-based machine learning.

MeSH terms

Deep Learning*
Language
Learning
Natural Language Processing
Social Media*

Grants and funding

405498/2021-7/National Council for Scientific and Technological Development