A two-way additive model with unknown group-specific interactions applied to gene expression data

Biom J. 2022 Aug;64(6):1007-1022. doi: 10.1002/bimj.202100282. Epub 2022 May 7.

Abstract

We propose a two-way additive model with group-specific interactions in which the group memberships are unknown. We treat group membership as a latent variable and propose an EM algorithm for estimation. Given a single observation matrix whose numbers of rows and columns both diverge, we rigorously establish the consistency and asymptotic normality of our estimator. Extensive simulation studies demonstrate its finite-sample performance. We apply the model to triple-negative breast cancer (TNBC) gene expression data and provide a new way to classify patients into subtypes. Our analysis identifies genes that may be associated with TNBC.
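To make the idea of latent row groups concrete, the following is a minimal sketch and not the authors' estimator. It assumes a model of the form y_ij = mu + a_i + b_j + gamma_{g(i),j} + e_ij, where g(i) is the unknown group of row i; the abstract does not state the exact parameterization, so this form is an assumption. As a simplification, the sketch removes the additive effects by double-centering and then runs a standard EM algorithm for a spherical Gaussian mixture on the rows of the residual matrix. All function names (`double_center`, `em_row_groups`, `simulate`) and the simulated data are hypothetical.

```python
import math
import random

def double_center(Y):
    """Subtract row and column means and add back the grand mean."""
    n, p = len(Y), len(Y[0])
    row = [sum(r) / p for r in Y]
    col = [sum(Y[i][j] for i in range(n)) / n for j in range(p)]
    grand = sum(row) / n
    return [[Y[i][j] - row[i] - col[j] + grand for j in range(p)] for i in range(n)]

def sqdist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def em_row_groups(R, K=2, iters=50):
    """EM for a K-component spherical Gaussian mixture over the rows of R."""
    n, p = len(R), len(R[0])
    # Deterministic farthest-point initialisation of the group profiles.
    centers = [R[0][:]]
    while len(centers) < K:
        far = max(R, key=lambda r: min(sqdist(r, c) for c in centers))
        centers.append(far[:])
    pi = [1.0 / K] * K
    sigma2 = sum(x * x for r in R for x in r) / (n * p)
    W = []
    for _ in range(iters):
        # E-step: posterior group probabilities, normalised with log-sum-exp.
        W = []
        for r in R:
            logw = [math.log(pi[k]) - sqdist(r, centers[k]) / (2 * sigma2)
                    for k in range(K)]
            m = max(logw)
            w = [math.exp(l - m) for l in logw]
            s = sum(w)
            W.append([x / s for x in w])
        # M-step: update mixing weights, group profiles, and shared variance.
        for k in range(K):
            nk = sum(W[i][k] for i in range(n))
            pi[k] = max(nk / n, 1e-12)
            if nk > 1e-12:
                centers[k] = [sum(W[i][k] * R[i][j] for i in range(n)) / nk
                              for j in range(p)]
        sigma2 = sum(W[i][k] * sqdist(R[i], centers[k])
                     for i in range(n) for k in range(K)) / (n * p)
        sigma2 = max(sigma2, 1e-8)
    labels = [max(range(K), key=lambda k: W[i][k]) for i in range(n)]
    return labels, centers, pi, sigma2

def simulate(n=40, p=30, seed=0):
    """Hypothetical data: two row groups with opposite interaction profiles."""
    rng = random.Random(seed)
    truth = [i % 2 for i in range(n)]
    a = [rng.gauss(0, 1) for _ in range(n)]   # row effects
    b = [rng.gauss(0, 1) for _ in range(p)]   # column effects
    gamma = [[1.0 if j < p // 2 else -1.0 for j in range(p)]]
    gamma.append([-g for g in gamma[0]])
    Y = [[2.0 + a[i] + b[j] + gamma[truth[i]][j] + rng.gauss(0, 0.5)
          for j in range(p)] for i in range(n)]
    return Y, truth

Y, truth = simulate()
labels, centers, pi, sigma2 = em_row_groups(double_center(Y))
agree = sum(l == t for l, t in zip(labels, truth)) / len(truth)
accuracy = max(agree, 1 - agree)  # group labels are only defined up to permutation
```

The two-stage shortcut (center first, cluster second) is chosen only for brevity; a joint EM that updates the additive effects and the group-specific interactions together, as the abstract suggests, would differ in its M-step.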

Keywords: EM algorithm; gene expression analysis; high-dimension problem; interaction effects; subgroup structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Gene Expression
  • Humans
  • Triple Negative Breast Neoplasms* / genetics