Statistical Methods for the Analysis of Food Composition Databases: A Review

Yusentha Balakrishna; Samuel Manda; Henry Mwambi; Averalda van Graan

doi:10.3390/nu14112193

Nutrients. 2022 May 25;14(11):2193. doi: 10.3390/nu14112193.

Authors

Yusentha Balakrishna^{1,2}, Samuel Manda^{2,3}, Henry Mwambi², Averalda van Graan^{4,5}

Affiliations

¹ Biostatistics Research Unit, South African Medical Research Council, Durban 4001, South Africa.
² School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg 3201, South Africa.
³ Department of Statistics, University of Pretoria, Pretoria 0028, South Africa.
⁴ Biostatistics Research Unit, SAFOODS Division, South African Medical Research Council, Cape Town 8001, South Africa.
⁵ Division of Human Nutrition, Department of Global Health, Stellenbosch University, Cape Town 8001, South Africa.

Abstract

Evidence-based knowledge of the relationship between foods and nutrients is needed to inform dietary-based guidelines and policy. Proper and tailored statistical methods to analyse food composition databases (FCDBs) could assist in this regard. This review aims to collate the existing literature that used any statistical method to analyse FCDBs, to identify key trends and research gaps. The search strategy yielded 4238 references from electronic databases, of which 24 fulfilled our inclusion criteria. Information on the objectives, statistical methods, and results was extracted. Statistical methods were mostly applied to group similar food items (37.5%). Other aims and objectives included determining associations between the nutrient content and known food characteristics (25.0%), determining nutrient co-occurrence (20.8%), evaluating nutrient changes over time (16.7%), and addressing the accuracy and completeness of databases (16.7%). Standard statistical tests (33.3%) were the most utilised, followed by clustering (29.1%), other methods (16.7%), regression methods (12.5%), and dimension reduction techniques (8.3%). Nutrient data have unique characteristics such as correlated components, natural groupings, and a compositional nature. Statistical methods used for analysis need to account for this data structure. Our summary of the literature provides a reference for researchers looking to expand into this area.
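As an illustration of what accounting for the compositional nature of nutrient data can look like in practice, the sketch below applies a centred log-ratio (clr) transform, one common choice for compositional data. The abstract does not name this technique, so it is an assumption here, and the nutrient values and variable names are hypothetical.

```python
import math

def clr(composition):
    """Centred log-ratio transform of one strictly positive composition."""
    if any(x <= 0 for x in composition):
        # Zeros need separate handling (e.g. replacement) before a log-ratio transform.
        raise ValueError("clr needs strictly positive parts")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

# Hypothetical nutrient amounts for two food items (illustrative values only).
# food_b has the same proportions as food_a but a different total.
food_a = [10.0, 20.0, 70.0]
food_b = [1.0, 2.0, 7.0]

clr_a = clr(food_a)
clr_b = clr(food_b)
```

Because the transform depends only on ratios between parts, the two items map to the same clr coordinates, and each transformed vector sums to zero. Distance-based methods such as clustering could then be run on the transformed values rather than on the raw amounts.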

Keywords: clustering; dimension reduction; food composition database; nutrient database; regression; review; statistical methods.

Publication types

Review

MeSH terms

Cluster Analysis
Databases, Factual
Food
Food Analysis
Nutrients*
Nutrition Policy*

Grants and funding

The time that Y.B., S.M. and A.v.G. spent on this research was funded by the South African Medical Research Council.