aPRIDIT Unsupervised Classification with Asymmetric Valuation of Variable Discriminatory Worth

Multivariate Behav Res. 2020 Sep-Oct;55(5):685-703. doi: 10.1080/00273171.2019.1665979. Epub 2019 Sep 27.

Abstract

Sometimes one needs to classify individuals into groups, but no grouping information is available because social desirability bias distorts the reporting of behaviors such as unethical or dishonest intentions or unlawful actions. Assessing hard-to-detect behaviors is useful; however, it is methodologically difficult because people are unlikely to self-disclose bad actions. This paper presents an unsupervised classification methodology utilizing ordinal categorical predictor variables. It allows for classification, individual respondent ranking, and grouping without access to a dependent group indicator variable. The methodology also measures predictor variable worth (for determining target behavior group membership) at a predictor variable category-by-category level, so different variable response categories can contain different amounts of information about classification. It is asymmetric in that a "0" on a binary predictor does not have a similar impact toward signaling "membership in the target group" as a "1" has for signaling "membership in the non-target group." The methodology is illustrated by identifying Spanish consumers filing fraudulent insurance claims. A second illustration classifies Portuguese high school students' propensity to alcohol abuse. Results show the methodology is useful when it is difficult to get dependent variable information, and is useful for deciding which predictor variables and categorical response options are most important.

Keywords: Detecting hidden behavior; asymmetric measures; classification into non-self-disclosed behavior groups; non-parametric classification; unsupervised learning.
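
The abstract gives no formulas, so the following is only a minimal sketch of the classic symmetric PRIDIT idea that the name aPRIDIT appears to build on: score each ordinal response category by its RIDIT value, then take the first principal component of the scored matrix to weight the predictors and rank respondents. It is not the asymmetric, category-by-category valuation the paper proposes. The function names `ridit_scores` and `pridit`, the sign convention, and the toy data are assumptions made for illustration.

```python
import numpy as np

def ridit_scores(column):
    """RIDIT-style score for each observation of one ordinal column.

    For category i with sample proportion p_i, the score is
    (share of responses in lower categories) - (share in higher categories),
    so scores lie in [-1, 1] and average to zero over the sample.
    """
    column = np.asarray(column)
    _, inverse, counts = np.unique(column, return_inverse=True, return_counts=True)
    p = counts / counts.sum()
    below = np.cumsum(p) - p      # proportion strictly below each category
    above = 1.0 - np.cumsum(p)    # proportion strictly above each category
    return (below - above)[inverse]

def pridit(X):
    """Symmetric PRIDIT sketch: RIDIT-score every column, then use the
    leading eigenvector of F'F as predictor weights (assumed convention)."""
    X = np.asarray(X)
    F = np.column_stack([ridit_scores(X[:, j]) for j in range(X.shape[1])])
    eigvals, eigvecs = np.linalg.eigh(F.T @ F)   # eigenvalues in ascending order
    w = eigvecs[:, -1]                           # leading eigenvector
    if w.sum() < 0:                              # fix the arbitrary sign of the eigenvector
        w = -w
    return F @ w, w

# Hypothetical toy data: 6 respondents, 3 ordinal predictors (not from the paper).
X = np.array([[0, 1, 2],
              [0, 0, 1],
              [1, 1, 2],
              [0, 0, 0],
              [1, 1, 2],
              [0, 0, 0]])
scores, weights = pridit(X)
ranking = np.argsort(scores)   # respondents ordered from lowest to highest score
```

In this sketch, the weights give one number per predictor, which is exactly the limitation the abstract addresses: the paper's method assigns worth per response category instead. Which end of the score range corresponds to the target group depends on how the categories are coded, so the sign should be interpreted with care.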