Using Structural Equation Modeling to Examine the Influence of Social, Behavioral, and Nutritional Variables on Health Outcomes Based on NHANES Data: Addressing Complex Design, Nonnormally Distributed Variables, and Missing Information

Micah L Hartwell; Jam Khojasteh; Marianna S Wetherill; Julie M Croff; Denna Wheeler

doi:10.1093/cdn/nzz010

Curr Dev Nutr. 2019 Feb 4;3(5):nzz010. doi: 10.1093/cdn/nzz010. eCollection 2019 May.

Authors

Micah L Hartwell¹ ² ³, Jam Khojasteh², Marianna S Wetherill⁴, Julie M Croff³ ⁵, Denna Wheeler³ ⁵

Affiliations

¹ School of Community Health Sciences, Counseling and Counseling Psychology, Oklahoma State University, Stillwater, OK.
² School of Education Foundations, Leadership and Aviation, Oklahoma State University, Stillwater, OK.
³ Master of Public Health Program, Center for Health Sciences, Oklahoma State University, Tulsa, OK.
⁴ Department of Health Promotion Sciences, Hudson College of Public Health, University of Oklahoma Tulsa Schusterman Center, Tulsa, OK.
⁵ Department of Rural Health, Center for Health Sciences, Oklahoma State University, Tulsa, OK.

Abstract

Background: Structural equation modeling (SEM) is a multivariate analysis method for exploring relations between latent constructs and measured variables. As a theory-guided approach, SEM estimates directional pathways in complex models based on longitudinal or cross-sectional data where randomized controlled trials would be either unethical or cost-prohibitive. However, this method is infrequently used in nutrition research, despite recommendations by epidemiologists for its increased use.

Objectives: The aim of this study was to explore 3 key methodologic areas for consideration by researchers when conducting SEM with complex survey datasets: the use of sampling weights, treatment of missing data, and model estimation techniques.

Methods: Using data from NHANES waves 2005-2010, we developed an SEM to estimate the relation between the latent construct of depression and the measured variables of food security, tobacco use (serum cotinine), and age. We used a hierarchic approach to compare 5 SEM iterations, based on: 1 and 2) complete cases without and with the application of sampling weights; 3) an applied missingness dataset to test the accuracy of multiple imputation (MI); 4) the full NHANES dataset with imputed data and sampling weights; and 5) a final respecified model. Each iteration was estimated with maximum likelihood (ML) and with quasi-maximum likelihood with the Satorra-Bentler correction (QML) to compare path coefficients, standard errors, and model fit statistics.
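The role of sampling weights in iterations 1 and 2 can be illustrated with a minimal sketch. It is not the authors' code. It assumes the standard NHANES guidance that, when several 2-year cycles are pooled, each 2-year weight is divided by the number of cycles (3 for 2005-2010). The function names, scores, and weights below are hypothetical toy values.

```python
def combine_cycle_weights(two_year_weights, n_cycles):
    """Rescale 2-year survey weights for a pooled multi-cycle analysis."""
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    return [w / n_cycles for w in two_year_weights]


def weighted_mean(values, weights):
    """Weighted mean; each weight is the number of people a case represents."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical toy data: three respondents with a symptom score and a 2-year weight.
scores = [2.0, 4.0, 9.0]
weights_2yr = [3000.0, 6000.0, 1000.0]

# Assumption: three pooled cycles (2005-2006, 2007-2008, 2009-2010).
pooled_weights = combine_cycle_weights(weights_2yr, n_cycles=3)

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, pooled_weights)
print(f"unweighted mean: {unweighted:.2f}, weighted mean: {weighted:.2f}")
```

In this toy case the high-scoring respondent represents few people, so the weighted mean is lower than the unweighted one. Weights correct point estimates only; design-based standard errors also require the strata and primary sampling unit variables.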

Results: Path coefficients differed between 15.68% and 19.17% among model iterations. Nearly one-third of the cases had missing values, and MI reliably imputed values, allowing all cases to be represented in the final model iterations. QML provided better model fit statistics in all iterations.
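An applied-missingness check of the kind described in the Methods can be sketched as follows: start from complete data, delete values at random, impute them several times, and compare the pooled estimate with the known complete-data value. This is an illustration under stated assumptions, not the authors' procedure. The data are simulated, and the imputation step is stochastic regression imputation repeated m times, a simplified stand-in for full MI, which would also draw the regression parameters.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def impute_once(xs, ys_masked, rng):
    """Fill each missing y with a regression prediction plus random residual noise."""
    observed = [(x, y) for x, y in zip(xs, ys_masked) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [
        y if y is not None else a + b * x + rng.gauss(0, sd)
        for x, y in zip(xs, ys_masked)
    ]


rng = random.Random(2019)  # fixed seed so the sketch is reproducible

# Simulated complete data (hypothetical): y depends linearly on x.
xs = [rng.gauss(0, 1) for _ in range(500)]
ys_true = [1 + 2 * x + rng.gauss(0, 1) for x in xs]

# Apply missingness: delete about 30% of y completely at random.
ys_masked = [None if rng.random() < 0.3 else y for y in ys_true]
n_missing = sum(y is None for y in ys_masked)

# Impute m times and pool the point estimate by averaging across imputations.
m = 20
imputed_means = [statistics.fmean(impute_once(xs, ys_masked, rng)) for _ in range(m)]
pooled_mean = statistics.fmean(imputed_means)
true_mean = statistics.fmean(ys_true)

print(f"missing: {n_missing}, true mean: {true_mean:.3f}, pooled mean: {pooled_mean:.3f}")
```

A small gap between the pooled and complete-data estimates suggests the imputation model recovers the deleted values well; the same comparison can be made for path coefficients rather than a mean.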

Conclusions: Nutrition epidemiologists should use complex weights, MI, and QML as a best-practices approach to SEM when conducting analyses with complex design survey data.

Keywords: NHANES; Structural equation modeling; complex survey design; multiple imputation; quasi-maximum likelihood.

Grants and funding

P20 GM109097/GM/NIGMS NIH HHS/United States