A Maximum Margin Approach for Semisupervised Ordinal Regression Clustering

Yanshan Xiao; Bo Liu; Zhifeng Hao

doi:10.1109/TNNLS.2015.2434960

IEEE Trans Neural Netw Learn Syst. 2016 May;27(5):1003-19. doi: 10.1109/TNNLS.2015.2434960. Epub 2015 Jul 1.

PMID: 26151945

Abstract

Ordinal regression (OR) is generally defined as the task in which input samples are ranked on an ordinal scale. OR has found a wide variety of applications and has been studied extensively. However, most existing work focuses on supervised or semisupervised OR classification, and semisupervised OR clustering has not been explicitly addressed. In real-world OR applications, labeling a large number of training samples is usually time-consuming and costly; a set of unlabeled samples can be used to build the OR model instead. Moreover, even when sample labels are unavailable, relative ranking information about the unlabeled samples can sometimes be obtained and used to refine the OR model. Hence, how to build an OR model on unlabeled samples and use the sample ranking information to improve clustering accuracy remains a key challenge for OR applications. In this paper, we consider semisupervised OR clustering with sample-ranking constraints, which give the relative ranking of the unlabeled samples, and put forward a maximum margin approach for semisupervised OR clustering ([Formula: see text]SORC). On the one hand, [Formula: see text]SORC seeks a set of parallel hyperplanes that partition the unlabeled samples into clusters. On the other hand, a loss function is introduced to incorporate the sample ranking information into the clustering process. The resulting optimization problem maximizes the margins between the closest neighboring clusters while minimizing the loss associated with the sample-ranking constraints. Extensive experiments on OR data sets show that the proposed [Formula: see text]SORC method outperforms the traditional semisupervised clustering methods considered.
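
The two ingredients named in the abstract, parallel hyperplanes that split samples into ordered clusters and a loss tied to sample-ranking constraints, can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not give the objective, so the hinge-style loss, the `margin` parameter, and the function names `project`, `assign_clusters`, and `ranking_hinge_loss` are all assumptions made for illustration. The sketch also takes the direction `w` and the thresholds as given rather than optimizing them.

```python
from bisect import bisect_right


def project(w, x):
    """Score a sample by projecting it onto the shared direction w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def assign_clusters(w, thresholds, samples):
    """Assign each sample to an ordered cluster.

    Parallel hyperplanes share the normal vector w and differ only in
    their offsets (thresholds, sorted ascending). The cluster index is
    the number of thresholds at or below the sample's score.
    """
    return [bisect_right(thresholds, project(w, x)) for x in samples]


def ranking_hinge_loss(w, samples, constraints, margin=1.0):
    """Assumed hinge-style penalty for violated ranking constraints.

    Each constraint (i, j) states that sample i should rank above
    sample j; it costs nothing once the score gap reaches the margin.
    """
    total = 0.0
    for i, j in constraints:
        gap = project(w, samples[i]) - project(w, samples[j])
        total += max(0.0, margin - gap)
    return total


# Toy data: four points along a diagonal, two thresholds, three clusters.
samples = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
w = (0.5, 0.5)
thresholds = [0.5, 2.5]

clusters = assign_clusters(w, thresholds, samples)
# Constraint (1, 0) is satisfied; constraint (0, 3) is violated.
loss = ranking_hinge_loss(w, samples, [(1, 0), (0, 3)])
print(clusters, loss)
```

A full method along the lines the abstract describes would search over `w` and the thresholds to trade off cluster margins against this kind of constraint loss; the sketch only evaluates one fixed choice.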

Publication types

Research Support, Non-U.S. Gov't