Fast Deep Stacked Networks based on Extreme Learning Machine applied to regression problems

Neural Netw. 2020 Nov:131:14-28. doi: 10.1016/j.neunet.2020.07.018. Epub 2020 Jul 19.

Abstract

Deep learning techniques are commonly used to process large amounts of data and achieve good results in many applications. Those methods, however, can lead to long training times. An alternative to tuning all parameters of a large network simultaneously is to stack smaller modules, which improves training efficiency. However, methods such as the Deep Stacked Network (DSN) have drawbacks that increase their training time and memory usage. To address these problems, the Fast DSN (FDSN) was proposed, in which the modules are trained with an Extreme Learning Machine (ELM) variant. Nonetheless, to speed up FDSN training, the ELM random feature mapping is shared among the modules, which can hurt the network's performance if the weights are not properly chosen. In this paper, we focus on the weight initialization of FDSN in order to improve its performance. We also propose FKDSN, a kernel-based variant of FDSN, and discuss the theoretical complexity of the methods. We evaluate three different initialization approaches on ELM-trained neural networks over 50 public real-world regression datasets. Our experiments show that FDSN, when combined with a more complex initialization method, achieves results similar to those of ELM algorithms applied to large single-hidden-layer feedforward networks (SLFNs), while requiring shorter training time and less memory; this suggests it is suitable for systems with restricted resources, such as Internet of Things devices. FKDSN also achieved results and training times similar to those of the large SLFNs, while requiring less memory.
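To make the abstract's building blocks concrete, the following is a minimal Python/NumPy sketch of an ELM-trained module (random, fixed hidden layer plus a closed-form least-squares output layer) and of a DSN-style stack in which each new module receives the raw inputs augmented with the previous module's prediction. The hidden-layer size, tanh activation, ridge regularization, and the way predictions are stacked onto the inputs are illustrative assumptions, not the authors' exact FDSN or FKDSN formulation.

```python
import numpy as np

def train_elm_module(X, y, n_hidden=100, reg=1e-3, rng=None):
    """Train one ELM module: random hidden layer + least-squares output weights."""
    rng = np.random.default_rng(rng)
    n_features = X.shape[1]
    # Random feature mapping, fixed after initialization (never trained).
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer output matrix
    # Output weights via regularized least squares (closed form, no backprop).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_elm_module(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Stacking principle (DSN-style): each module sees the original features
# augmented with the previous module's prediction.
def train_stack(X, y, n_modules=3, **elm_kwargs):
    modules, X_aug = [], X
    for _ in range(n_modules):
        W, b, beta = train_elm_module(X_aug, y, **elm_kwargs)
        pred = predict_elm_module(X_aug, W, b, beta)
        modules.append((W, b, beta))
        X_aug = np.hstack([X_aug, pred.reshape(-1, 1)])
    return modules

def predict_stack(X, modules):
    X_aug = X
    for W, b, beta in modules:
        pred = predict_elm_module(X_aug, W, b, beta)
        X_aug = np.hstack([X_aug, pred.reshape(-1, 1)])
    return pred  # prediction of the last module
```

Because the hidden weights are never updated, each module costs only one linear solve, which is why stacking many small ELM modules can be cheaper in time and memory than training a single large SLFN, as the abstract argues.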

Keywords: Deep Stacked Network; Extreme Learning Machine; Regression; Stacking principle.

MeSH terms

  • Machine Learning*
  • Regression Analysis