Comparison of independent samples of high-dimensional data by pairwise distance measures

Biom J. 2007 Apr;49(2):230-41. doi: 10.1002/bimj.200510262.

Abstract

Pairwise distance or association measures of sample elements are often used as a basis for hierarchical cluster analyses. They can also be used in tests for the comparison of pre-defined subgroups of the total sample. Usually this is done with permutation tests. In this paper, we compare such a procedure with alternative tests for high-dimensional data based on spherically distributed scores in simulation experiments and with real data. The tests based on the pairwise distance or similarity measures perform quite well in this comparison. As the number of possible permutations is small in very small samples, this might restrict the use of the test. Therefore, we propose an exact parametric small sample version of the test using randomly rotated samples.
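
As an illustration of the general idea only, a permutation test on pairwise distances might be sketched as follows. The abstract does not specify the test statistic, so the one used here (mean between-group distance minus mean within-group distance, with Euclidean distances) is an assumption, and the function names `pairwise_distances`, `between_minus_within` and `permutation_test` are hypothetical. The sketch also assumes two pre-defined groups with at least two elements each, and it does not cover the rotation-based small sample version.

```python
import math
import random


def pairwise_distances(samples):
    """Euclidean distance matrix for a list of equal-length vectors."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(samples[i], samples[j])
            dist[i][j] = dist[j][i] = d
    return dist


def between_minus_within(dist, labels):
    """Assumed statistic: mean between-group minus mean within-group distance."""
    between, within = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                within.append(dist[i][j])
            else:
                between.append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_test(samples, labels, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for the group comparison."""
    rng = random.Random(seed)
    # Distances are computed once; only the group labels are permuted.
    dist = pairwise_distances(samples)
    observed = between_minus_within(dist, labels)
    shuffled = list(labels)
    count = 1  # the observed labelling is counted as one permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_minus_within(dist, shuffled) >= observed:
            count += 1
    return observed, count / (n_perm + 1)


if __name__ == "__main__":
    # Simulated data: two groups of 6 elements in 50 dimensions with a mean shift.
    rng = random.Random(1)
    group_a = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(6)]
    group_b = [[rng.gauss(1.0, 1.0) for _ in range(50)] for _ in range(6)]
    stat, p_value = permutation_test(group_a + group_b, ["a"] * 6 + ["b"] * 6)
    print(stat, p_value)
```

The small sample limitation mentioned in the abstract can be seen by counting: with two groups of three elements there are only 20 distinct label assignments, so an exact permutation p-value cannot fall below 1/20.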

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • DNA Fingerprinting
  • DNA, Bacterial / chemistry
  • DNA, Bacterial / genetics
  • Data Interpretation, Statistical*
  • Humans
  • Models, Statistical*
  • Monte Carlo Method
  • Oligonucleotide Array Sequence Analysis
  • Polymerase Chain Reaction
  • Principal Component Analysis / methods*
  • Soil Microbiology
  • Thyroid Nodule / genetics

Substances

  • DNA, Bacterial