Multi-study factor analysis

Biometrics. 2019 Mar;75(1):337-346. doi: 10.1111/biom.12974. Epub 2019 Mar 8.

Abstract

We introduce a novel class of factor analysis methodologies for the joint analysis of multiple studies. The goal is to separately identify and estimate (1) common factors shared across multiple studies, and (2) study-specific factors. We develop an Expectation Conditional-Maximization (ECM) algorithm for parameter estimation and provide a procedure for choosing the numbers of common and study-specific factors. We present simulations to evaluate the performance of the method and illustrate it by applying it to gene expression data in ovarian cancer. In both the simulations and the application, we clarify the benefits of a joint analysis compared to the standard factor analysis. The method provides a tool to accelerate the pace at which unsupervised analyses can be combined across multiple studies, and to understand the cross-study reproducibility of signal in multivariate data. An R package, MSFA, is available on GitHub.
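The split between common and study-specific factors can be made concrete with a small simulation. The sketch below is illustrative only and is not the MSFA package or the paper's ECM algorithm. It assumes a model form that the abstract does not spell out: each observation in study s is generated as x = Phi f + Lambda_s l + e, with a loading matrix Phi shared by all studies, a loading matrix Lambda_s specific to study s, standard normal factors, and independent noise with diagonal covariance Psi_s. Under that assumption the covariance of study s is Phi Phi^T + Lambda_s Lambda_s^T + Psi_s. The function name, argument names, and dimensions are all hypothetical.

```python
import numpy as np

def simulate_multi_study(n_per_study, p, k_common, k_specific, seed=0):
    """Simulate data under an ASSUMED multi-study factor model (illustration only)."""
    rng = np.random.default_rng(seed)
    phi = rng.normal(size=(p, k_common))          # loadings shared by all studies
    studies = []
    for n_s, j_s in zip(n_per_study, k_specific):
        lam = rng.normal(size=(p, j_s))           # loadings specific to this study
        psi = rng.uniform(0.5, 1.5, size=p)       # diagonal noise variances
        f = rng.normal(size=(n_s, k_common))      # common factor scores
        l = rng.normal(size=(n_s, j_s))           # study-specific factor scores
        e = rng.normal(size=(n_s, p)) * np.sqrt(psi)
        x = f @ phi.T + l @ lam.T + e
        # Covariance implied by the assumed model for this study.
        sigma = phi @ phi.T + lam @ lam.T + np.diag(psi)
        studies.append({"x": x, "lambda": lam, "psi": psi, "sigma": sigma})
    return phi, studies

# Two hypothetical studies with 6 variables, 2 common factors,
# and 1 and 2 study-specific factors respectively.
phi, studies = simulate_multi_study([5000, 4000], p=6, k_common=2, k_specific=[1, 2])
for s in studies:
    sample_cov = np.cov(s["x"], rowvar=False)
    rel_err = np.max(np.abs(sample_cov - s["sigma"])) / np.max(np.abs(s["sigma"]))
    print(f"largest relative covariance error: {rel_err:.3f}")
```

With a few thousand observations per study, the sample covariance of each simulated study should lie close to its implied covariance. Fitting a separate standard factor analysis to each study would not, by itself, distinguish the shared term from the study-specific one, which is the distinction a joint analysis aims to recover.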

Keywords: Dimension reduction; ECM algorithm; cross-study analysis; gene expression; meta-analysis; reproducibility.

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Factor Analysis, Statistical*
  • Female
  • Gene Expression
  • Humans
  • Immune System
  • Ovarian Neoplasms / genetics
  • Ovarian Neoplasms / immunology
  • Reproducibility of Results