Development and validation of a prediction equation for body fat percentage from measured BMI: a supervised machine learning approach

Sci Rep. 2023 May 17;13(1):8010. doi: 10.1038/s41598-023-33914-5.

Abstract

Body mass index (BMI) is a widely used but poor predictor of adiposity in populations with excessive fat-free mass. Rigorous predictive models that are validated in a nationally representative sample of the US population and that can be used for calibration are needed. The objective of this study was to develop and validate prediction equations for body fat percentage (BF%), measured by dual-energy X-ray absorptiometry, using BMI and socio-demographic characteristics. We used National Health and Nutrition Examination Survey (NHANES) data from 5931 and 2340 adults aged 20 to 69 in 1999-2002 and 2003-2006, respectively. A supervised machine learning approach using ordinary least squares with a validation set was used to develop models and select the best ones based on R² and root mean square error. We compared our findings with other published models and used our best models to assess the amount of bias in the association between predicted BF% and elevated low-density lipoprotein (LDL). Three models included BMI, BMI², age, gender, education, income, and interaction terms; they produced R² values of 0.87 and yielded the smallest standard errors of estimation. The amount of bias in the association between predicted BF% and elevated LDL from our best model was -0.005. Our models provided strong predictive ability and low bias compared with most published models. Their strengths lie in their simplicity and ease of use in low-resource settings.
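
The validation set approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or their published equation: the data are synthetic, the coefficients in `true_beta` are made up for illustration, the 70/30 split is an assumption, and the predictor set is reduced to BMI, BMI squared, age, a sex indicator, and one BMI-by-sex interaction (education and income are omitted). The helper names `design_matrix`, `fit_ols`, and `r2_rmse` are hypothetical.

```python
import numpy as np

def design_matrix(bmi, age, female):
    """Columns: intercept, BMI, BMI^2, age, sex indicator, BMI x sex interaction."""
    return np.column_stack(
        [np.ones_like(bmi), bmi, bmi**2, age, female, bmi * female]
    )

def fit_ols(X, y):
    """Ordinary least squares coefficients via a least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def r2_rmse(X, y, beta):
    """R-squared and root mean square error of predictions X @ beta."""
    resid = y - X @ beta
    rmse = float(np.sqrt(np.mean(resid**2)))
    r2 = float(1.0 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2))
    return r2, rmse

# Synthetic stand-in for survey data (assumption: NOT NHANES values).
rng = np.random.default_rng(0)
n = 2000
bmi = rng.uniform(18.0, 45.0, n)
age = rng.uniform(20.0, 69.0, n)
female = rng.integers(0, 2, n).astype(float)

# Made-up coefficients, used only to generate an illustrative outcome.
true_beta = np.array([-30.0, 2.5, -0.025, 0.05, 12.0, -0.05])
X = design_matrix(bmi, age, female)
body_fat_pct = X @ true_beta + rng.normal(0.0, 3.0, n)

# Validation set approach: fit on one part, evaluate on the held-out part.
idx = rng.permutation(n)
train, valid = idx[:1400], idx[1400:]
beta_hat = fit_ols(X[train], body_fat_pct[train])
r2_valid, rmse_valid = r2_rmse(X[valid], body_fat_pct[valid], beta_hat)

print(f"validation R^2 = {r2_valid:.3f}, RMSE = {rmse_valid:.3f}")
```

Candidate models with different predictor sets would each be fitted on the training part and compared on the held-out part, keeping those with the highest R² and lowest root mean square error.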

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Absorptiometry, Photon
  • Adipose Tissue*
  • Body Composition*
  • Body Mass Index
  • Nutrition Surveys
  • Predictive Value of Tests
  • Reproducibility of Results
  • Sex Factors