Background and aims: Problem gambling among adolescents has recently attracted attention because of easy access to gambling in online environments and its serious effects on adolescents' lives. We propose a machine learning-based analysis method for predicting the degree of problem gambling.
Methods: Of the 17,520 respondents in the 2018 National Survey on Youth Gambling Problems dataset (collected by the Korea Center on Gambling Problems), 5,045 students who had gambled in the past 3 months were included in this study. The Gambling Problem Severity Scale was used to derive the binary labels. After applying random forest-based feature selection, we trained four models: random forest (RF), support vector machine (SVM), extra trees (ET), and ridge regression.
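The two-stage procedure in Methods (random forest-based feature selection, then training four classifiers) can be sketched as follows. This is a minimal illustration and not the study's actual code: it assumes scikit-learn, uses synthetic data in place of the survey responses, and picks a median importance threshold, an 80/20 split, and default hyperparameters purely for demonstration. `RidgeClassifier` stands in for "ridge regression" applied to a binary label, which is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the survey data (assumption: the real dataset is not used here).
X, y = make_classification(
    n_samples=1000, n_features=30, n_informative=8,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Step 1: random forest-based feature selection, fitted on the training split only.
# The "median" threshold is an illustrative choice, not taken from the study.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0),
    threshold="median",
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: train the four model families on the selected features.
models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
    "ET": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "Ridge": RidgeClassifier(),
}
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    test_accuracy[name] = model.score(X_test_sel, y_test)

for name, acc in test_accuracy.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

Fitting the selector on the training split alone keeps information from the testing set out of the feature ranking.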
Results: Online gambling behavior in the past 3 months, experience of winning money or goods, and gambling within personal relationships were the three factors exhibiting high feature importance. All four models achieved an area under the curve (AUC) of >0.7; ET showed the highest AUC (0.755), RF the highest accuracy (71.8%), and SVM the highest F1 score (0.507) on the testing set.
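Because the three reported metrics measure different things, a different model can lead on each. The sketch below computes accuracy, F1, and AUC from first principles on a small invented example. It assumes that "AUC" means the area under the ROC curve and that hard predictions come from thresholding scores at 0.5; the labels and scores are made up for illustration and are not study data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy example with a minority positive class.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

acc_value = accuracy(y_true, y_pred)
f1_value = f1(y_true, y_pred)
auc_value = roc_auc(y_true, scores)
print(f"accuracy={acc_value:.3f}  F1={f1_value:.3f}  AUC={auc_value:.3f}")
```

Accuracy and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives. With an imbalanced label, F1 can sit well below accuracy, since it ignores true negatives.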
Discussion: The results indicate that machine learning models can convey meaningful information to support predictions regarding the degree of problem gambling.
Conclusion: Machine learning models trained using important features showed moderate accuracy in a large-scale Korean adolescent dataset. These findings suggest that the method will help screen adolescents at risk of problem gambling. We believe that expandable machine learning-based approaches will become more powerful as more datasets are collected.
Keywords: adolescents; feature engineering; machine learning-based analysis method; problem gambling.