Gender differences in under-reporting hiring discrimination in Korea: a machine learning approach

Epidemiol Health. 2021;43:e2021099. doi: 10.4178/epih.e2021099. Epub 2021 Nov 17.

Abstract

Objectives: This study examined gender differences in the under-reporting of hiring discrimination by building a prediction model for workers who responded "not applicable (NA)" to a question about hiring discrimination despite being eligible to answer.

Methods: Using data from 3,576 wage workers in the seventh wave (2004) of the Korea Labor and Income Panel Study, we trained and tested 9 machine learning algorithms using "yes" or "no" responses regarding the lifetime experience of hiring discrimination. We then applied the best-performing model to estimate the prevalence of experiencing hiring discrimination among those who answered "NA." Under-reporting of hiring discrimination was calculated by comparing the prevalence of hiring discrimination between the "yes" or "no" group and the "NA" group.
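The workflow described in the Methods can be illustrated with a minimal sketch: fit candidate models on respondents who answered "yes" or "no", keep the one that performs best on held-out data, apply it to the "NA" group, and compare prevalences. Everything in the block is an assumption made for illustration. The data are synthetic and are not KLIPS records, the two covariates are made up, and simple one-feature threshold rules ("stumps") stand in for the 9 algorithms, which the abstract does not list.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def synth_rows(n, mean_x0):
    # Two made-up covariates per worker; these are NOT KLIPS variables.
    return [(rng.gauss(mean_x0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

def fit_stump(feature, rows, labels, grid=(0.0, 0.4, 0.8, 1.2)):
    """Stand-in 'algorithm': choose the threshold on one feature that
    maximizes training accuracy, and return the resulting classifier."""
    def train_acc(t):
        hits = sum((r[feature] > t) == bool(y) for r, y in zip(rows, labels))
        return hits / len(rows)
    best_t = max(grid, key=train_acc)
    return lambda row: int(row[feature] > best_t)

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

# Synthetic "yes"/"no" respondents: 1 = reported hiring discrimination.
answered = synth_rows(400, mean_x0=0.0)
answers = [int(r[0] > 0.8) for r in answered]

# Synthetic "NA" respondents: covariates observed, outcome missing.
na_rows = synth_rows(200, mean_x0=1.0)

# Train/test split within the "yes"/"no" group only.
train_X, test_X = answered[:280], answered[280:]
train_y, test_y = answers[:280], answers[280:]

# Fit each candidate on the training part, score it on the test part.
candidates = {f"stump_x{f}": fit_stump(f, train_X, train_y) for f in (0, 1)}
scores = {name: accuracy(m, test_X, test_y) for name, m in candidates.items()}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name]

# Reported prevalence among "yes"/"no" respondents versus predicted
# prevalence in the "NA" group; the difference is one possible gap measure.
reported = sum(answers) / len(answers)
predicted_na = sum(best_model(r) for r in na_rows) / len(na_rows)
gap = predicted_na - reported

print(best_name, round(reported, 3), round(predicted_na, 3), round(gap, 3))
```

The design point is that the model never sees the "NA" group during training or model selection; it only receives their covariates at prediction time.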

Results: Based on the predictions from the random forest model, we found that 58.8% of the "NA" group were predicted to have experienced hiring discrimination, while 19.7% of the "yes" or "no" group reported hiring discrimination. Among the "NA" group, the predicted prevalence of hiring discrimination for men and women was 45.3% and 84.8%, respectively.
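As a worked illustration of the comparison, the short sketch below computes differences and a ratio from the percentages reported in the Results. The abstract does not state which summary measure was used, so the absolute difference (in percentage points) and the prevalence ratio are assumptions chosen for illustration, as are the variable names.

```python
# Percentages taken from the Results above.
predicted_na = 58.8       # "NA" group, predicted prevalence (%)
reported_yes_no = 19.7    # "yes"/"no" group, reported prevalence (%)
predicted_na_men = 45.3   # "NA" group, men (%)
predicted_na_women = 84.8 # "NA" group, women (%)

# Assumed summary measures (not specified in the abstract).
gap_points = round(predicted_na - reported_yes_no, 1)             # percentage points
prevalence_ratio = round(predicted_na / reported_yes_no, 2)       # unitless
gender_gap_points = round(predicted_na_women - predicted_na_men, 1)

print(gap_points)         # 39.1
print(prevalence_ratio)   # 2.98
print(gender_gap_points)  # 39.5
```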

Conclusions: This study introduces a methodological strategy for epidemiologic studies to address the under-reporting of discrimination by applying machine learning algorithms.

Keywords: Machine learning; Social discrimination; Social epidemiology.

MeSH terms

  • Female
  • Humans
  • Machine Learning*
  • Male
  • Republic of Korea / epidemiology
  • Sex Factors