An efficient model-free approach to interaction screening for high dimensional data

Stat Med. 2023 May 10;42(10):1583-1605. doi: 10.1002/sim.9688. Epub 2023 Mar 1.

Abstract

An innovated model-free interaction screening procedure called the MCVIS is proposed for high dimensional data analysis. Specifically, we adopt the introduced MCV index for quantifying the importance of an interaction effect among predictors. Our proposed method is fully nonparametric and is capable of successfully selecting interactions even if the signal of parental main effects is weak. The MCVIS procedure has many distinctive features: (i) it can work with discrete, categorical and continuous covariates; (ii) it can deal with both categorical and continuous response, even handle the missing response; (iii) it is robust for heavy-tailed distributions, thus well accommodates heterogeneity typically caused by high dimensionality; (iv) it enjoys the sure screening and ranking consistency properties, therefore achieves dimension reduction without information loss. In another respect, computational feasibility is a top concern in high dimensional data analysis, by transforming our MCV into several variants, the MCVIS procedure is simple and fast to implement. Extensive numerical experiments and comparisons confirm the effectiveness and wide applicability of our MCVIS procedure. We further illustrate the proposed methodology by empirical study of two real datasets. Supplementary materials are available online.

Keywords: conditional variance; interaction effects; missing response; slicing; sure screening.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Analysis*
  • Humans