Shared Gaussian Process Latent Variable Model for Incomplete Multiview Clustering

Ping Li; Songcan Chen

doi:10.1109/TCYB.2018.2863790

IEEE Trans Cybern. 2020 Jan;50(1):61-73. doi: 10.1109/TCYB.2018.2863790. Epub 2018 Aug 30.

PMID: 30176618

Abstract

Many multiview learning methods have been proposed that integrate the complementary information in multiple views and can significantly improve performance on machine learning tasks compared with single-view methods. However, most of these methods fail to learn good models when the multiview data are unpaired (or partially paired) or incomplete (or partially complete). Although previous attempts have been made to address these problems, they often yield poor results on incomplete multiview data that contain a relatively large number of missing instances. In fact, the incomplete problem is more challenging than the unpaired problem, since less shared information can be captured by the model in the former case. In this paper, we propose a shared Gaussian process (GP) latent variable model for incomplete multiview clustering that combines the merits of both worlds (i.e., GP and multiview learning). Specifically, it jointly learns a set of intentionally aligned representative auxiliary points in the individual views, which both compensate for missing instances and implement a group-level constraint. Thus, the shared information among the views is explicitly built into the model. All hyperparameters and auxiliary points are learned simultaneously by variational inference. Compared with existing methods, our method naturally inherits the advantages of GP. Furthermore, it extends straightforwardly to more than two views without adding any complexity to the formulation. In the experiments, we compare it with state-of-the-art methods for incomplete multiview clustering to demonstrate its superiority.
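
The idea of tying several incomplete views to one shared latent space can be illustrated with a minimal sketch. This is not the paper's variational objective and it does not include the auxiliary points. It is a simplified shared GP latent variable objective, assumed here for illustration only, in which each view's GP marginal likelihood is evaluated on the instances observed in that view. The function names, the RBF kernel, its fixed hyperparameters, and the use of `None` to mark a missing instance are all assumptions.

```python
import math

def rbf_kernel(X, variance=1.0, lengthscale=1.0, noise=1e-2):
    """RBF Gram matrix over latent points X, with noise added on the diagonal."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = variance * math.exp(-0.5 * sq / lengthscale ** 2)
        K[i][i] += noise
    return K

def cholesky(A):
    """Lower-triangular L with A = L L^T (A must be positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_solve(L, b):
    """Solve L z = b for lower-triangular L."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def gp_log_marginal(X, Y):
    """Sum over output dimensions d of log N(Y[:, d] | 0, K(X, X))."""
    n = len(X)
    L = cholesky(rbf_kernel(X))
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    total = 0.0
    for d in range(len(Y[0])):
        z = forward_solve(L, [row[d] for row in Y])  # z = L^{-1} y
        quad = sum(v * v for v in z)                 # y^T K^{-1} y
        total += -0.5 * (quad + log_det + n * math.log(2 * math.pi))
    return total

def shared_objective(X, views):
    """Sum of per-view GP log marginals over a shared latent X.

    A view marks a missing instance with None; that instance is dropped
    from the view's likelihood (an assumption of this sketch).
    """
    total = 0.0
    for Y in views:
        obs = [i for i, row in enumerate(Y) if row is not None]
        total += gp_log_marginal([X[i] for i in obs], [Y[i] for i in obs])
    return total

# Toy data: 4 instances, 1-D shared latent space, two views of different
# dimensionality, each missing one instance.
X = [[0.0], [0.1], [2.0], [2.1]]
view1 = [[1.0], [1.1], None, [-0.9]]
view2 = [None, [0.5, 0.4], [-0.5, -0.6], [-0.4, -0.5]]
print(shared_objective(X, [view1, view2]))
```

In a full model the latent coordinates `X` and the kernel hyperparameters would be optimized rather than fixed, and clustering would then be run on the learned `X`. Adding a third view only appends another term to the sum, which mirrors the claim that the formulation does not grow more complex with more views.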