EL V.2 Model for Predicting Food Safety Risks at Taiwan Border Using the Voting-Based Ensemble Method

Li-Ya Wu; Fang-Ming Liu; Sung-Shun Weng; Wen-Chou Lin

doi:10.3390/foods12112118

Foods. 2023 May 24;12(11):2118. doi: 10.3390/foods12112118.

Authors

Li-Ya Wu¹, Fang-Ming Liu¹, Sung-Shun Weng², Wen-Chou Lin¹

Affiliations

¹ Food and Drug Administration, Ministry of Health and Welfare, Taipei 115209, Taiwan.
² Department of Information and Finance Management, National Taipei University of Technology, Taipei 10608, Taiwan.

Abstract

Border management is a crucial checkpoint at which governments regulate the quality and safety of imported food. In 2020, the first-generation ensemble learning prediction model (EL V.1) was introduced into Taiwan's border food management. This model assesses the risk of imported food by combining five algorithms to determine whether quality sampling should be performed at the border. In this study, a second-generation ensemble learning prediction model (EL V.2) was developed based on seven algorithms to enhance the "detection rate of unqualified cases" and improve the robustness of the model. Elastic Net was used to select the characteristic risk factors, and two algorithms were used to construct the new model: Bagging-Gradient Boosting Machine and Bagging-Elastic Net. In addition, F_β was used to flexibly control the sampling rate, improving the predictive performance and robustness of the model. The chi-square test was employed to compare the efficacy of "pre-launch (2019) random sampling inspection" and "post-launch (2020-2022) model prediction sampling inspection". For cases recommended for inspection by the ensemble learning model and subsequently inspected, the unqualified rates were 5.10%, 6.36%, and 4.39% in 2020, 2021, and 2022, respectively, all significantly higher (p < 0.001) than the 2.09% unqualified rate under random sampling in 2019. Prediction indices derived from the confusion matrix were used to further evaluate EL V.1 and EL V.2: EL V.2 exhibited superior predictive performance compared with EL V.1, and both models outperformed random sampling.
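The role of F_β in controlling the sampling rate can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the voting rule, the threshold search, or the data layout, so the majority-vote rule, the function names (`f_beta`, `precision_recall`, `best_threshold`, `majority_vote`), and the toy labels and scores below are all assumptions made for illustration. The sketch relies only on the standard definition F_β = (1 + β²)·P·R / (β²·P + R), where P is precision and R is recall.

```python
def f_beta(precision, recall, beta):
    """Standard F_beta score; returns 0.0 when the denominator is zero."""
    denom = beta ** 2 * precision + recall
    return 0.0 if denom == 0 else (1 + beta ** 2) * precision * recall / denom


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = unqualified, 0 = qualified)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def best_threshold(y_true, scores, beta, candidates):
    """Pick the candidate risk-score cutoff that maximizes F_beta.

    Assumption: each model outputs a risk score and a consignment is
    flagged for inspection when its score is at or above the cutoff.
    """
    best_t, best_f = None, -1.0
    for t in candidates:
        y_pred = [1 if s >= t else 0 for s in scores]
        p, r = precision_recall(y_true, y_pred)
        f = f_beta(p, r, beta)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


def majority_vote(votes_per_model):
    """Hard voting (assumed rule): flag when more than half of the models flag."""
    n_models = len(votes_per_model)
    return [1 if 2 * sum(col) > n_models else 0 for col in zip(*votes_per_model)]


# Hypothetical toy data: true outcomes and one model's risk scores.
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.3, 0.1]
candidates = [0.2, 0.5, 0.8]

recall_weighted = best_threshold(y_true, scores, beta=2.0, candidates=candidates)
precision_weighted = best_threshold(y_true, scores, beta=0.5, candidates=candidates)

# Hypothetical votes from three models on three consignments.
combined = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

On this toy data, β = 2 selects the lowest cutoff (more consignments sampled, higher recall), whereas β = 0.5 selects the highest cutoff (fewer consignments sampled, higher precision). This is one way in which the choice of β could act as a dial on the sampling rate.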

Keywords: border management; ensemble learning; food safety; machine learning; risk prediction.

Grants and funding

This research received no external funding.