Sequence-Based Prediction of Plant Allergenic Proteins: Machine Learning Classification Approach

ACS Omega. 2023 Jan 20;8(4):3698-3704. doi: 10.1021/acsomega.2c02842. eCollection 2023 Jan 31.

Abstract

This Article proposes a novel chemometric approach to understanding and exploring the allergenic nature of food proteins. Using machine learning methods (supervised and unsupervised), this work aims to predict the allergenicity of plant proteins. The strategy is based on scoring descriptors and testing their classification performance. Partitioning was based on support vector machines (SVM), and a k-nearest neighbor (KNN) classifier was applied. A fivefold cross-validation approach was used to validate the KNN classifier in the variable selection step as well as the final classifier. To overcome the problem of food allergies, a robust and efficient method for protein classification is needed.