Mixture of Experts with Entropic Regularization for Data Classification

Billy Peralta; Ariel Saavedra; Luis Caro; Alvaro Soto

doi:10.3390/e21020190

Entropy (Basel). 2019 Feb 18;21(2):190. doi: 10.3390/e21020190.

Authors

Billy Peralta¹, Ariel Saavedra², Luis Caro², Alvaro Soto³

Affiliations

¹ Department of Engineering Science, Andres Bello University, Santiago 7500971, Chile.
² Department of Engineering Informatics, Catholic University of Temuco, Temuco 4781312, Chile.
³ Department of Computer Sciences, Pontifical Catholic University of Chile, Santiago 7820436, Chile.

Abstract

There is growing interest in automatic classification for a variety of tasks, such as weather forecasting, product recommendation, intrusion detection, and people recognition. "Mixture-of-experts" is a well-known classification technique: a probabilistic model consisting of local expert classifiers weighted by a gate network, typically based on softmax functions, which together can learn complex patterns in data. In this scheme, one data point is influenced by only one expert; as a result, the training process can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model. In the proposed model, the classification cost is penalized by the Shannon entropy of the gating network in order to avoid a "winner-takes-all" output for the gating network. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3-6% on some datasets. In future work, we plan to embed feature selection into this model.
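
The entropy-penalized cost described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear softmax experts, the function name `moe_entropy_loss`, the weight `lam`, and the choice to subtract the mean gate entropy from the negative log-likelihood (so that gates spreading weight over several experts are rewarded) are all assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def moe_entropy_loss(X, y, gate_W, expert_W, lam=0.1, eps=1e-12):
    """Hypothetical entropy-regularized mixture-of-experts cost.

    X        : (n, d) inputs
    y        : (n,) integer class labels in [0, C)
    gate_W   : (d, K) gate weights (softmax gate, as in the abstract)
    expert_W : (K, d, C) weights of K linear softmax experts (assumed form)
    lam      : assumed regularization weight on the gate entropy
    """
    n = X.shape[0]
    K, _, C = expert_W.shape

    # Gate output: one probability per expert for every data point.
    gate = softmax(X @ gate_W)                               # (n, K)

    # Each expert's class probabilities for every data point.
    logits = np.einsum("nd,kdc->nkc", X, expert_W)           # (n, K, C)
    expert_p = softmax(logits.reshape(n * K, C)).reshape(n, K, C)

    # Mixture prediction: gate-weighted average of expert predictions.
    mix = np.einsum("nk,nkc->nc", gate, expert_p)            # (n, C)

    # Classification cost: mean negative log-likelihood of the true class.
    nll = -np.log(mix[np.arange(n), y] + eps).mean()

    # Mean Shannon entropy of the gate; it is near zero when the gate
    # collapses to a "winner-takes-all" output.
    gate_entropy = -(gate * np.log(gate + eps)).sum(axis=1).mean()

    # Assumed sign convention: higher gate entropy lowers the cost.
    return nll - lam * gate_entropy, nll, gate_entropy
```

With all weights set to zero, the gate is uniform over the K experts and the mixture is uniform over the C classes, so the classification term equals log C and the entropy term equals log K; this gives a quick sanity check before plugging the cost into any optimizer.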

Keywords: classification; entropy; mixture-of-experts; regularization.

Grants and funding

11140892/Fondo Nacional de Desarrollo Científico y Tecnológico