Stochastic proximity embedding on graphics processing units: taking multidimensional scaling to a new scale

J Chem Inf Model. 2011 Nov 28;51(11):2852-9. doi: 10.1021/ci200420c. Epub 2011 Oct 21.

Abstract

Stochastic proximity embedding (SPE) was developed as a method for efficiently calculating lower-dimensional embeddings of high-dimensional data sets. Rather than using a global minimization scheme, SPE iteratively adjusts the positions of randomly selected pairs of points so that the distances between them in the embedding better match their original proximities. This approach was found to generate embeddings of comparable quality to those obtained using classical multidimensional scaling algorithms. However, SPE obtains these results in O(n) rather than O(n²) time and is thus much better suited to large data sets. In an effort both to speed up SPE and to apply it to even larger problems, we have created a multithreaded implementation that takes advantage of the growing general-purpose computing power of graphics processing units (GPUs). The use of GPUs allows the embedding of data sets containing millions of data points on interactive time scales.
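
The pairwise update idea summarized in the abstract can be illustrated with a small single-threaded sketch. This is not the authors' GPU implementation. It follows the commonly described form of the SPE update rule, and the function names (`spe_embed`, `stress`), the Euclidean distance measure, the linear learning-rate schedule, and all default parameter values are assumptions chosen for illustration.

```python
import math
import random


def spe_embed(points, dim=2, cycles=100, steps_per_cycle=None,
              lam_start=1.0, lam_end=0.01, eps=1e-9, seed=0):
    """Illustrative SPE sketch (assumed parameters, not the paper's code).

    points: sequence of equal-length coordinate tuples (high-dimensional data).
    Returns a list of `dim`-dimensional coordinates, one per input point.
    """
    rng = random.Random(seed)
    n = len(points)
    if steps_per_cycle is None:
        # Assumption: the number of pair updates grows linearly with n,
        # which is where the O(n) scaling comes from.
        steps_per_cycle = 10 * n
    # Start from random positions in the unit hypercube.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n)]
    lam = lam_start
    decrement = (lam_start - lam_end) / max(cycles - 1, 1)
    for _ in range(cycles):
        for _ in range(steps_per_cycle):
            # Pick a random pair of distinct points.
            i, j = rng.sample(range(n), 2)
            # Target distance is computed on demand, so no n x n matrix is stored.
            target = math.dist(points[i], points[j])
            current = math.dist(coords[i], coords[j])
            # Move both points along the line joining them so that the
            # embedded distance shifts toward the target distance.
            scale = lam * 0.5 * (target - current) / (current + eps)
            for k in range(dim):
                delta = scale * (coords[i][k] - coords[j][k])
                coords[i][k] += delta
                coords[j][k] -= delta
        # Reduce the learning rate after each cycle (assumed linear schedule).
        lam -= decrement
    return coords


def stress(points, coords):
    """Normalized squared mismatch between original and embedded distances."""
    num = 0.0
    den = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(points[i], points[j])
            d = math.dist(coords[i], coords[j])
            num += (d - r) ** 2
            den += r ** 2
    return num / den
```

For example, calling `spe_embed(data, dim=2)` on a list of coordinate tuples returns one 2D position per point, and `stress(data, embedding)` gives a rough quality measure (0 means all pairwise distances are preserved). Note that `stress` itself is O(n²) and is meant only for checking small examples. Because each pair update touches just two points, many updates can in principle run concurrently, which is the property a multithreaded GPU version would exploit.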

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Computer Graphics
  • Computers
  • Databases, Factual
  • Drug Discovery / methods*
  • Drug Discovery / statistics & numerical data
  • Software*