On generalized latent factor modeling and inference for high-dimensional binomial data

Biometrics. 2023 Sep;79(3):2311-2320. doi: 10.1111/biom.13768. Epub 2022 Oct 25.

Abstract

We explore a hierarchical generalized latent factor model for discrete, bounded response variables, in particular binomial responses. Specifically, we develop a novel two-step estimation procedure, together with the corresponding statistical inference, that is computationally efficient and scales to high dimensions in both the number of subjects and the number of features per subject. We also establish the validity of the estimation procedure, in particular the asymptotic properties of the estimated effect size and the latent structure, as well as of the estimated number of latent factors. The results are corroborated by a simulation study, and, for illustration, the proposed methodology is applied to a dataset from a gene-environment association study.
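As a rough illustration of the kind of data such a model describes, the following minimal sketch simulates binomial responses whose success probabilities depend on an observed covariate and on a few unobserved latent factors. Everything specific here is an assumption made for illustration and is not taken from the paper: the logit link, the Gaussian covariate, factors, and loadings, the function name `simulate_binomial_factor_data`, and all parameter values. It shows only a generic data-generating setup, not the authors' two-step estimation procedure.

```python
import numpy as np


def simulate_binomial_factor_data(n=200, p=50, k=2, m=2, seed=0):
    """Simulate an n-by-p matrix of Binomial(m, prob) responses.

    Hypothetical setup (not from the paper): a logit link with
    linear predictor eta = x @ beta + z @ lam.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 1))                # observed covariate, one per subject
    beta = rng.normal(scale=0.5, size=(1, p))  # effect size, one per feature
    z = rng.normal(size=(n, k))                # latent factors (unobserved in practice)
    lam = rng.normal(scale=0.5, size=(k, p))   # factor loadings, k per feature
    eta = x @ beta + z @ lam                   # n-by-p linear predictor
    prob = 1.0 / (1.0 + np.exp(-eta))          # inverse logit, values in (0, 1)
    y = rng.binomial(m, prob)                  # discrete responses bounded by 0 and m
    return y, x, prob


y, x, prob = simulate_binomial_factor_data()
print(y.shape, int(y.min()), int(y.max()))
```

In this sketch every response is an integer between 0 and `m`, which is the "discrete and bounded" property the abstract refers to, and both `n` (subjects) and `p` (features per subject) can be increased to mimic the high-dimensional setting.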

Keywords: Discrete bounded data; eigenanalysis; gene-environment association; generalized linear mixed model; sub-Gaussian error.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computer Simulation*
  • Statistics as Topic*