Machine learning-based analysis of adolescent gambling factors

Wonju Seo; Namho Kim; Sang-Kyu Lee; Sung-Min Park

doi:10.1556/2006.2020.00063

J Behav Addict. 2020 Oct 3;9(3):734-743. doi: 10.1556/2006.2020.00063. Print 2020 Oct 12.

Authors

Wonju Seo¹, Namho Kim¹, Sang-Kyu Lee², Sung-Min Park¹

Affiliations

¹ Department of Creative IT Engineering, Pohang University of Science and Technology, 77 Cheongam-ro, Nam-gu, Pohang, 37673, Republic of Korea.
² Department of Psychology, College of Medicine, Hallym University, 1 Hallymdaehak-gil, Chuncheon, 24252, Republic of Korea.

Abstract

Background and aims: Problem gambling among adolescents has recently attracted attention because of easy access to gambling in online environments and its serious effects on adolescent lives. We proposed a machine learning-based analysis method for predicting the degree of problem gambling.

Methods: Of the 17,520 respondents in the 2018 National Survey on Youth Gambling Problems dataset (collected by the Korea Center on Gambling Problems), 5,045 students who had gambled in the past 3 months were included in this study. The Gambling Problem Severity Scale was used to derive the binary labels. After random forest-based feature selection, we trained four models: random forest (RF), support vector machine (SVM), extra trees (ET), and ridge regression.
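
The workflow described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a hypothetical stand-in for the survey data, and picks the selection threshold, the train/test split, and all hyperparameters arbitrarily. Ridge regression is approximated here with `RidgeClassifier`, which is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the survey data (hypothetical values).
X, y = make_classification(n_samples=1000, n_features=30, n_informative=6,
                           weights=[0.75, 0.25], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Step 1: random forest-based feature selection, fitted on training data only.
# The "median" threshold is an assumption; the abstract does not state one.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="median")
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: the four model families named in the abstract (default-ish settings).
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": SVC(kernel="rbf"),
    "ET": ExtraTreesClassifier(n_estimators=200, random_state=0),
    "Ridge": RidgeClassifier(),
}

# Step 3: evaluate each model on the held-out testing set.
results = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    y_pred = model.predict(X_test_sel)
    # AUC needs a continuous score: use class probabilities when the model
    # exposes them, otherwise fall back to the signed decision function.
    if hasattr(model, "predict_proba"):
        scores = model.predict_proba(X_test_sel)[:, 1]
    else:
        scores = model.decision_function(X_test_sel)
    results[name] = {
        "auc": roc_auc_score(y_test, scores),
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Fitting the selector on the training split only keeps the testing set out of the feature-selection step, which is one common way to avoid optimistic test metrics.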

Results: Online gambling behavior in the past 3 months, experience of winning money or goods, and gambling in personal relationships were the three factors with the highest feature importance. All four models achieved an area under the curve (AUC) of >0.7 on the testing set; ET showed the highest AUC (0.755), RF the highest accuracy (71.8%), and SVM the highest F1 score (0.507).
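
Because a different model leads on each of the three reported metrics, it may help to see how the metrics are defined. The sketch below computes accuracy, F1 score, and AUC in plain Python on a small made-up example; the labels and scores are hypothetical and are not taken from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = problem gambling) and model scores.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8, 0.1, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold is an assumption

acc_value = accuracy(y_true, y_pred)
f1_value = f1(y_true, y_pred)
auc_value = auc(y_true, scores)
print(acc_value, f1_value, auc_value)
```

Accuracy and F1 depend on the chosen decision threshold, whereas AUC depends only on how the scores rank the cases, which is one reason the three metrics need not agree on the best model.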

Discussion: The results indicate that machine learning models can convey meaningful information to support predictions regarding the degree of problem gambling.

Conclusion: Machine learning models trained using important features showed moderate accuracy in a large-scale Korean adolescent dataset. These findings suggest that the method will help screen adolescents at risk of problem gambling. We believe that expandable machine learning-based approaches will become more powerful as more datasets are collected.

Keywords: adolescents; feature engineering; machine learning-based analysis method; problem gambling.

MeSH terms

Adolescent
Adolescent Behavior / physiology*
Female
Gambling / physiopathology*
Health Surveys
Humans
Machine Learning*
Male
Severity of Illness Index
Support Vector Machine