Feature screening with latent responses

Biometrics. 2023 Jun;79(2):878-890. doi: 10.1111/biom.13658. Epub 2022 Mar 31.

Abstract

A novel feature screening method is proposed to examine the correlation between latent responses and potential predictors in ultrahigh-dimensional data analysis. First, a confirmatory factor analysis (CFA) model is used to characterize latent responses through multiple observed variables. The expectation-maximization algorithm is employed to estimate the parameters in the CFA model. Second, R-Vector (RV) correlation is used to measure the dependence between the multivariate latent responses and covariates of interest. Third, a feature screening procedure is proposed on the basis of an unbiased estimator of the RV coefficient. The sure screening property of the proposed screening procedure is established under certain mild conditions. Monte Carlo simulations are conducted to assess the finite-sample performance of the feature screening procedure. The proposed method is applied to an investigation of the relationship between psychological well-being and the human genome.

Keywords: RV coefficient; latent response; sure independent screening; ultrahigh-dimensional data analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Factor Analysis, Statistical
  • Genome, Human*
  • Humans
  • Monte Carlo Method