Machine learning-based analysis of adolescent gambling factors

J Behav Addict. 2020 Oct 3;9(3):734-743. doi: 10.1556/2006.2020.00063. Print 2020 Oct 12.

Abstract

Background and aims: Problem gambling among adolescents has recently attracted attention because of easy access to gambling in online environments and its serious effects on adolescents' lives. We proposed a machine learning-based analysis method for predicting the degree of problem gambling.

Methods: Of the 17,520 respondents in the 2018 National Survey on Youth Gambling Problems dataset (collected by the Korea Center on Gambling Problems), 5,045 students who had gambled in the past 3 months were included in this study. The Gambling Problem Severity Scale was used to provide the binary label information. After random forest-based feature selection, we trained four models: random forest (RF), support vector machine (SVM), extra trees (ET), and ridge regression.
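The pipeline described in the Methods can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic (generated with make_classification), and the hyperparameters, the median importance threshold, the 80/20 split, the feature scaling, and the use of RidgeClassifier as a stand-in for "ridge regression" on a binary label are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the survey responses (NOT the real dataset).
X, y = make_classification(
    n_samples=1000, n_features=30, n_informative=6,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Step 1: random forest-based feature selection.
# Assumption: keep features whose importance is at or above the median.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="median",
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Step 2: train the four model families on the selected features.
models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
    "ET": ExtraTreesClassifier(n_estimators=200, random_state=0),
    # Assumption: RidgeClassifier stands in for ridge regression here.
    "Ridge": make_pipeline(StandardScaler(), RidgeClassifier()),
}


def continuous_scores(model, features):
    """Return a continuous score per sample, suitable for computing AUC."""
    if hasattr(model, "decision_function"):
        return model.decision_function(features)
    return model.predict_proba(features)[:, 1]


auc_by_model = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    auc_by_model[name] = roc_auc_score(
        y_test, continuous_scores(model, X_test_sel)
    )
```

Fitting the selector on the training split only keeps information from the held-out data out of the feature ranking. The values in auc_by_model come from synthetic data and are not comparable to the figures reported in the abstract.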

Results: Online gambling behavior in the past 3 months, experience of winning money or goods, and gambling of personal relationship were three factors exhibiting high feature importance. All four models demonstrated an area under the curve (AUC) of >0.7; ET showed the highest AUC (0.755), RF demonstrated the highest accuracy (71.8%), and SVM showed the highest F1 score (0.507) on the test set.
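The three metrics reported here measure different things, which is one reason a different model can lead on each. The sketch below computes accuracy, F1 score, and AUC from scratch on a small made-up example; the labels, scores, and the 0.5 decision threshold are illustrative assumptions and are unrelated to the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = problem gambling) and model scores.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_score = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_score]  # assumed threshold

toy_accuracy = accuracy(toy_true, toy_pred)  # 8 of 10 correct
toy_f1 = f1(toy_true, toy_pred)              # precision = recall = 2/3
toy_auc = auc(toy_true, toy_score)           # 19 of 21 pairs ranked correctly
```

AUC depends only on how the scores rank positives against negatives, whereas accuracy and F1 depend on the chosen threshold, and F1 ignores true negatives. With an imbalanced label, a model can therefore post a relatively high accuracy alongside a modest F1 score.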

Discussion: The results indicate that machine learning models can convey meaningful information to support predictions regarding the degree of problem gambling.

Conclusion: Machine learning models trained using important features showed moderate accuracy in a large-scale Korean adolescent dataset. These findings suggest that the method will help screen adolescents at risk of problem gambling. We believe that expandable machine learning-based approaches will become more powerful as more datasets are collected.

Keywords: adolescents; feature engineering; machine learning-based analysis method; problem gambling.

MeSH terms

  • Adolescent
  • Adolescent Behavior / physiology*
  • Female
  • Gambling / physiopathology*
  • Health Surveys
  • Humans
  • Machine Learning*
  • Male
  • Severity of Illness Index
  • Support Vector Machine