An Expectation-Maximization Algorithm for Combining a Sample of Partially Overlapping Covariance Matrices

Axioms. 2023 Feb;12(2):161. doi: 10.3390/axioms12020161. Epub 2023 Feb 4.

Abstract

The generation of unprecedented amounts of data brings new challenges in data management, but also an opportunity to accelerate the identification of processes across multiple scientific disciplines. One of these challenges is the harmonization of high-dimensional, unbalanced, and heterogeneous data. In this manuscript, we propose a statistical approach to combine incomplete and partially overlapping pieces of covariance matrices that come from independent experiments. We assume that the data are a random sample of partial covariance matrices drawn from Wishart distributions, and we derive an expectation-maximization algorithm for parameter estimation. We demonstrate the properties of our method using (i) simulation studies and (ii) empirical datasets. More generally, the ability to make inferences about the covariance of variables that were never observed in the same experiment is a valuable tool for data analysis, since covariance estimation is an important step in many statistical applications, such as multivariate analysis, principal component analysis, factor analysis, and structural equation modeling.

Keywords: 62H12; 62P10; 62H20; covariance estimation; expectation-maximization; heterogeneous databases; imputation; multi-view data.
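
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions that the abstract does not state: every complete matrix is taken to be a scaled Wishart draw with the same known degrees of freedom and mean Psi, and each experiment observes only the block of that matrix indexed by a subset of the variables. Under those assumptions the expected complete matrix given an observed block has a closed form, and the update of Psi is the average of these expectations. The names `conditional_expectation`, `em_combine`, `partials`, and `n_iter` are hypothetical and chosen only for this example.

```python
import numpy as np


def conditional_expectation(psi, y_obs, obs):
    """E-step for one sample: expected complete matrix given its observed block.

    Assumes (hypothetically) that the complete matrix is a scaled Wishart
    draw with mean `psi`; `y_obs` is the block indexed by `obs`.
    """
    p = psi.shape[0]
    obs = np.asarray(obs)
    mis = np.setdiff1d(np.arange(p), obs)  # variables not seen in this sample
    full = np.empty((p, p))
    full[np.ix_(obs, obs)] = y_obs
    if mis.size:
        psi_aa = psi[np.ix_(obs, obs)]
        psi_ab = psi[np.ix_(obs, mis)]
        psi_bb = psi[np.ix_(mis, mis)]
        b = np.linalg.solve(psi_aa, psi_ab)  # regression coefficients
        cross = y_obs @ b                    # expected observed-by-missing block
        full[np.ix_(obs, mis)] = cross
        full[np.ix_(mis, obs)] = cross.T
        # conditional covariance plus the part explained by the observed block
        full[np.ix_(mis, mis)] = psi_bb - psi_ab.T @ b + b.T @ cross
    return full


def em_combine(partials, p, n_iter=200):
    """Alternate the E-step above with an averaging M-step.

    `partials` is a list of (observed_block, observed_indices) pairs.
    A fixed iteration count is used here instead of a convergence check.
    """
    psi = np.eye(p)  # starting value; any positive definite matrix would do
    for _ in range(n_iter):
        psi = np.mean(
            [conditional_expectation(psi, y, obs) for y, obs in partials],
            axis=0,
        )
    return psi


# Simulated example: 4 variables, and no sample observes all of them.
rng = np.random.default_rng(0)
p, nu = 4, 50
a = rng.normal(size=(p, p))
sigma = a @ a.T + p * np.eye(p)  # "true" covariance for the simulation
subsets = [[0, 1, 2], [1, 2, 3], [0, 2, 3]]
partials = []
for obs in subsets * 10:
    x = rng.multivariate_normal(np.zeros(p), sigma, size=nu)
    y = x.T @ x / nu  # scaled Wishart draw with mean sigma
    partials.append((y[np.ix_(obs, obs)], obs))

psi_hat = em_combine(partials, p)
```

In this simulation every pair of variables is observed together in at least one subset. When some pair is never observed together, their covariance is inferred only through the variables they share with other samples, which is the situation the abstract highlights.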