Assessing the use of GEE methods for analyzing binary outcomes in family studies: the Strong Heart Family Study

J Biopharm Stat. 2024 Mar 29:1-13. doi: 10.1080/10543406.2024.2333516. Online ahead of print.

Abstract

The generalized estimating equations (GEE) method is commonly applied to analyze data obtained from family studies. GEE is well known for its robustness to misspecification of the correlation structure. However, the unbalanced distribution of family sizes and the complicated genetic relatedness structure within each family may challenge GEE performance. We focused our research on binary outcomes. To evaluate the performance of GEE, we conducted a series of simulations on data generated using the kinship matrix (the correlation structure within each family) from the Strong Heart Family Study (SHFS). We also performed a fivefold cross-validation on data from the SHFS to further evaluate the predictive power of GEE. A Bayesian modeling approach, with direct integration of the kinship matrix, was included as a contrast to GEE. Our simulation studies revealed that GEE performs well on a binary outcome from families having a relatively simple kinship structure. However, data with a binary outcome generated from families with complex kinship structures, especially with a large genetic variance, can challenge the performance of GEE.
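To make the simulation idea concrete, the following is a minimal sketch of how a binary outcome with kinship-driven within-family correlation might be generated. It is not the authors' procedure: the abstract does not state the data-generating model, so the logistic random-effects form, the function name `simulate_family`, the four-member nuclear-family kinship matrix, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical kinship matrix for one nuclear family, ordered as
# father, mother, child 1, child 2 (not taken from the SHFS).
KINSHIP = np.array([
    [0.50, 0.00, 0.25, 0.25],
    [0.00, 0.50, 0.25, 0.25],
    [0.25, 0.25, 0.50, 0.25],
    [0.25, 0.25, 0.25, 0.50],
])


def simulate_family(kinship, beta0, beta1, genetic_var, rng):
    """Simulate one covariate and a binary outcome for a single family.

    Assumed model: logit P(y = 1) = beta0 + beta1 * x + g, where the
    genetic effect g is multivariate normal with covariance
    genetic_var * 2 * kinship.
    """
    n = kinship.shape[0]
    x = rng.normal(size=n)
    # The Cholesky factor of 2K turns independent normals into
    # correlated genetic effects.
    chol = np.linalg.cholesky(2.0 * kinship)
    g = np.sqrt(genetic_var) * (chol @ rng.normal(size=n))
    prob = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x + g)))
    y = rng.binomial(1, prob)
    return x, y


rng = np.random.default_rng(0)
# 500 families with an illustrative genetic variance of 1.0.
families = [
    simulate_family(KINSHIP, beta0=-1.0, beta1=0.5, genetic_var=1.0, rng=rng)
    for _ in range(500)
]
```

Under these assumptions, raising `genetic_var` or substituting a larger multigenerational kinship matrix would strengthen and complicate the within-family correlation, which is the kind of setting the abstract reports as difficult for GEE.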

Keywords: Bayesian analysis; GEE; family studies; kinship matrix; prediction.