AI-based betting anomaly detection system to ensure fairness in sports and prevent illegal gambling

Changgyun Kim; Jae-Hyeon Park; Ji-Yong Lee

doi:10.1038/s41598-024-57195-8

Sci Rep. 2024 Mar 18;14(1):6470. doi: 10.1038/s41598-024-57195-8.

Authors

Changgyun Kim¹, Jae-Hyeon Park², Ji-Yong Lee³

Affiliations

¹ Department of Artificial Intelligence & Software, Kangwon National University, Samcheok, 25913, Republic of Korea.
² Center for Sports and Performance Analysis, Korea National Sport University, Seoul, 05541, Republic of Korea.
³ Center for Sports and Performance Analysis, Korea National Sport University, Seoul, 05541, Republic of Korea. 302479@knsu.ac.kr.

Abstract

This study addresses sports match-fixing by using machine-learning models to detect match-fixing anomalies from betting odds. We use five models to distinguish between normal and abnormal matches: logistic regression (LR), random forest (RF), support vector machine (SVM), k-nearest neighbor (KNN) classification, and an ensemble model optimized from the previous four. The models classify matches as normal or abnormal by learning patterns in sports betting odds data. The database was built from world football league match betting data from 12 betting companies, which offered a vast collection of data on players, teams, game schedules, and league rankings for football matches. We develop an abnormal match detection model based on the data analysis results of each model, using the match result dividend data. We then apply the five models to data from real-time matches to construct a system capable of detecting match-fixing in real time. The RF, KNN, and ensemble models achieved high accuracy, over 92%, whereas the LR and SVM models were approximately 80% accurate. In comparison, previous studies have used a single model to examine football match betting odds data, with an accuracy of 70-80%.
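The classification idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature choice (opening and closing home-win odds), the toy training rows, the function names `knn_predict` and `majority_vote`, and the use of a simple majority vote as the ensemble rule are all assumptions made for illustration. The abstract says only that the ensemble is optimized from the other four models.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def majority_vote(predictions):
    """Combine per-model labels (0 = normal, 1 = abnormal).

    Assumption: a match is flagged only when more than half of the models flag it.
    """
    return 1 if sum(predictions) * 2 > len(predictions) else 0

# Hypothetical toy data, invented for illustration only:
# each row is (opening home-win odds, closing home-win odds).
train_X = [(1.90, 1.88), (2.10, 2.05), (1.75, 1.78),   # small drift, labelled normal
           (2.50, 1.60), (3.00, 1.90), (1.80, 2.90)]   # large drift, labelled abnormal
train_y = [0, 0, 0, 1, 1, 1]

stable_match = knn_predict(train_X, train_y, (2.00, 1.98))
drifting_match = knn_predict(train_X, train_y, (2.60, 1.70))

# Labels from four hypothetical base models (LR, RF, SVM, KNN) for one match.
ensemble_label = majority_vote([1, 1, 0, 1])
```

In this sketch the KNN output would be one of the four votes passed to `majority_vote`; a real pipeline would train all four base models on the same odds features before combining them.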

Keywords: Ensemble model; Logistic regression; Match-fixing; Random forest; Support vector machine; k-Nearest neighbor.

MeSH terms

Artificial Intelligence
Football*
Gambling*
Humans
Logistic Models

Grants and funding

NRF-2020S1A5A2A03044544/This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea