Spectral multidimensional scaling

Proc Natl Acad Sci U S A. 2013 Nov 5;110(45):18052-7. doi: 10.1073/pnas.1308708110. Epub 2013 Oct 9.

Abstract

An important tool in information analysis is dimensionality reduction. There are various approaches for large data simplification by scaling its dimensions down that play a significant role in recognition and classification tasks. The efficiency of dimension reduction tools is measured in terms of memory and computational complexity, which are usually a function of the number of the given data points. Sparse local operators that involve substantially less than quadratic complexity at one end, and faithful multiscale models with quadratic cost at the other end, make the design of dimension reduction procedure a delicate balance between modeling accuracy and efficiency. Here, we combine the benefits of both and propose a low-dimensional multiscale modeling of the data, at a modest computational cost. The idea is to project the classical multidimensional scaling problem into the data spectral domain extracted from its Laplace-Beltrami operator. There, embedding into a small dimensional Euclidean space is accomplished while optimizing for a small number of coefficients. We provide a theoretical support and demonstrate that working in the natural eigenspace of the data, one could reduce the process complexity while maintaining the model fidelity. As examples, we efficiently canonize nonrigid shapes by embedding their intrinsic metric into , a method often used for matching and classifying almost isometric articulated objects. Finally, we demonstrate the method by exposing the style in which handwritten digits appear in a large collection of images. We also visualize clustering of digits by treating images as feature points that we map to a plane.

Keywords: big data; diffusion geometry; distance maps; flat embedding.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Data Interpretation, Statistical*
  • Mathematical Concepts*
  • Models, Theoretical*
  • Principal Component Analysis / methods