High-dimensional regression with ordered multiple categorical predictors

Stat Med. 2020 Feb 10;39(3):294-309. doi: 10.1002/sim.8400. Epub 2019 Nov 28.

Abstract

Models for ordered multiple categorical (OMC) response variables are well established and widely applied, but few studies have investigated linear regression problems with OMC predictors, especially in high-dimensional settings. In such settings, the pseudocategories of the discrete variable need to be identified automatically, and irrelevant explanatory variables need to be screened out. This paper introduces a dummy-variable transformation for such OMC predictors and proposes an L1-penalized regression method based on that transformation. Model selection consistency of the proposed method is derived under some common assumptions for high-dimensional settings. Both simulation studies and a real data analysis show good performance of the method, indicating its wide applicability in relevant regression analyses.
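The abstract does not spell out the transformation, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a cumulative ("thermometer") coding, in which dummy k equals 1 when the observed level exceeds k, so that each coefficient is the jump between two adjacent categories and an L1 penalty can shrink a jump to exactly zero, merging those categories into one pseudocategory. The function names `thermometer_encode` and `lasso_cd`, the toy data, and the penalty value are hypothetical and are not taken from the paper; the solver is a plain coordinate-descent lasso, not the authors' implementation.

```python
def thermometer_encode(levels, n_levels):
    """Encode ordinal levels 1..n_levels as n_levels - 1 cumulative dummies.

    Column k (k = 1..n_levels-1) is 1 when level > k, so its coefficient is
    the change in the response between adjacent categories k and k + 1.
    This coding is an assumption, not necessarily the paper's transformation.
    """
    return [[1.0 if level > k else 0.0 for k in range(1, n_levels)]
            for level in levels]


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=500):
    """Coordinate descent for (1/2n)||y - b0 - Xb||^2 + lam * ||b||_1.

    The intercept b0 is left unpenalized.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [y[i] - intercept for i in range(n)]
    for _ in range(n_iter):
        # Unpenalized intercept update: absorb the mean of the residuals.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(col[i] * (resid[i] + col[i] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam) / norm
            delta = new - beta[j]
            if delta != 0.0:
                resid = [resid[i] - col[i] * delta for i in range(n)]
            beta[j] = new
    return intercept, beta


# Toy data (hypothetical): four ordered levels, where levels 1-2 share one
# mean and levels 3-4 share another, so only the 2 -> 3 jump is nonzero.
levels = [1, 2, 3, 4] * 5
y = [2.0 if level > 2 else 0.0 for level in levels]

X = thermometer_encode(levels, n_levels=4)
intercept, beta = lasso_cd(X, y, lam=0.1)
print("intercept:", round(intercept, 4))
print("adjacent-category jumps:", [round(b, 4) for b in beta])
```

In this toy setting the jumps for 1 -> 2 and 3 -> 4 are shrunk to zero, which corresponds to collapsing the four levels into two pseudocategories, while the surviving jump is biased toward zero by the penalty, as is usual for the lasso.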

Keywords: dummy variables; multicategory; penalized regression; sign consistency; transformation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Humans
  • Multivariate Analysis*
  • Regression Analysis*