Using a centered general linear model for detection of interactions among biomarkers

Stat Methods Med Res. 2024 Mar;33(3):414-432. doi: 10.1177/09622802231224639. Epub 2024 Feb 6.

Abstract

The dummy-variable-based general linear model (gLM) is commonly used to model categorical factors and their interactions. However, the main factors and their interactions in a gLM are often correlated even when the factors are independently distributed. The classical two-way factorial analysis of variance (ANOVA) model avoids this correlation when the main factors are independent, but the constraints on its model parameters make it difficult to apply within a regular linear regression model, especially in the presence of other covariates. In this study, a centered general linear model (cgLM) is proposed for modeling interactions between categorical factors based on their centered dummy variables. We show that, like the ANOVA model, the cgLM avoids the correlation between the main factors and their interactions when the main factors are independent. At the same time, like the gLM, it can be used in regular regression and fitted conveniently by the standard least squares approach, with appropriately chosen baselines removing the need for constraints on its model parameters. A simulation study illustrates the potential advantage of the cgLM over the gLM for detecting interactions in model-building procedures. Finally, the cgLM is applied to a postmortem brain gene expression data set.
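The correlation claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes two independent binary factors, assumes that "centering" means subtracting each dummy variable's sample mean, and assumes that the interaction term is the product of the two (centered or uncentered) dummies. The level probabilities 0.3 and 0.6, the sample size, and all variable names are hypothetical choices made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    sxx = sum((u - mean_x) ** 2 for u in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


def center(x):
    """Subtract the sample mean (assumed meaning of 'centered dummy variable')."""
    mean_x = sum(x) / len(x)
    return [v - mean_x for v in x]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Two independent binary factors coded as 0/1 dummy variables
# (level probabilities are arbitrary, hypothetical values).
a = [1.0 if rng.random() < 0.3 else 0.0 for _ in range(n)]
b = [1.0 if rng.random() < 0.6 else 0.0 for _ in range(n)]

# gLM-style interaction: product of the raw dummies.
ab_dummy = [u * v for u, v in zip(a, b)]

# cgLM-style interaction (as assumed here): product of the centered dummies.
a_c, b_c = center(a), center(b)
ab_centered = [u * v for u, v in zip(a_c, b_c)]

corr_dummy = pearson(a, ab_dummy)
corr_centered = pearson(a_c, ab_centered)

print(f"corr(A, A*B), raw dummies:      {corr_dummy:.3f}")
print(f"corr(A, A*B), centered dummies: {corr_centered:.3f}")
```

Under these assumptions the raw dummy for factor A is clearly correlated with the raw product term, while the centered dummy is close to uncorrelated with the centered product, since for independent factors the expectation of (A - E[A])^2 (B - E[B]) factorizes and the second factor has mean zero.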

Keywords: General linear model; interactions; least square estimates; mean corrections; two-way factorial analysis of variance.

MeSH terms

  • Analysis of Variance
  • Biomarkers
  • Brain*
  • Computer Simulation
  • Linear Models

Substances

  • Biomarkers