Variable selection in finite mixture of regression models using the skew-normal distribution

J Appl Stat. 2019 Dec 31;47(16):2941-2960. doi: 10.1080/02664763.2019.1709051. eCollection 2020.

Abstract

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. Most applications of variable selection in FMR models assume a normal distribution for the regression error. This assumption is unsuitable for data containing one or more groups of observations with asymmetric behavior. In this paper, we introduce a variable selection procedure for FMR models using the skew-normal distribution. With an appropriate choice of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. To estimate the model parameters, we develop a modified EM algorithm. The methodology is illustrated through numerical experiments and a real data example.
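
To make the ingredients concrete, the following is a minimal sketch, not the authors' implementation, of a skew-normal FMR density together with the three penalties named in the keywords (LASSO, Hard, SCAD). The abstract does not give the paper's parameterization or objective function, so several things here are assumptions: the standard skew-normal density (2/sigma) * phi(z) * Phi(lambda * z), a linear predictor that enters through the location parameter, the textbook forms of the penalties, and the conventional SCAD constant a = 3.7. All function names are hypothetical.

```python
import math


def skew_normal_pdf(y, mu, sigma, lam):
    """Standard skew-normal density with location mu, scale sigma, skewness lam.

    Assumed form: (2 / sigma) * phi(z) * Phi(lam * z), z = (y - mu) / sigma.
    With lam = 0 this reduces to the normal density.
    """
    z = (y - mu) / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(lam * z / math.sqrt(2.0)))
    return 2.0 / sigma * phi * Phi


def fmr_density(y, x, pis, betas, sigmas, lams):
    """Mixture density sum_k pi_k * SN(y; x'beta_k, sigma_k, lam_k).

    Assumption: each component's linear predictor is its location parameter.
    """
    total = 0.0
    for pi_k, beta_k, sigma_k, lam_k in zip(pis, betas, sigmas, lams):
        mu_k = sum(b * xj for b, xj in zip(beta_k, x))
        total += pi_k * skew_normal_pdf(y, mu_k, sigma_k, lam_k)
    return total


def lasso_penalty(theta, lam):
    """L1 penalty: lam * |theta|."""
    return lam * abs(theta)


def hard_penalty(theta, lam):
    """Hard-thresholding penalty: lam^2 - (|theta| - lam)^2 * 1{|theta| < lam}."""
    t = abs(theta)
    return lam ** 2 - (t - lam) ** 2 if t < lam else lam ** 2


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty in its usual piecewise form (requires a > 2)."""
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0
```

All three penalties are zero at the origin. LASSO grows without bound, whereas Hard and SCAD level off at a constant for large coefficients, the feature usually associated with oracle-type results. A penalized estimator would subtract a sum of such penalties over the component regression coefficients from the mixture log-likelihood; the exact weighting used in the paper is not stated in the abstract.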

Keywords: 62F35; 62H30; 62J07; Hard; LASSO; SCAD; Variable selection; mixture regression models; skew-normal distribution.

Grants and funding

This work is partially supported by the National Natural Science Foundation of China (11861041; 11261025).