3D similarity-dissimilarity plot for high dimensional data visualization in the context of biomedical pattern classification

J Med Syst. 2013 Jun;37(3):9944. doi: 10.1007/s10916-013-9944-5. Epub 2013 Apr 13.

Abstract

In real life biomedical classification applications, it is difficult to visualize the feature space due to high dimensionality of the feature space. In this paper, we have proposed 3D similarity-dissimilarity plot to project the high dimensional space to a three dimensional space in which important information about the feature space can be extracted in the context of pattern classification. In this plot it is possible to visualize good data points (data points near to their own class as compared to other classes) and bad data points (data points far away from their own class) and outlier points (data points away from both their own class and other classes). Hence separation of classes can easily be visualized. Density of the data points near each other can provide some useful information about the compactness of the clusters within certain class. Moreover, an index called percentage of data points above the similarity-dissimilarity line (PAS) is proposed which is the fraction of data points above the similarity-dissimilarity line. Several synthetic and real life biomedical datasets are used to show the effectiveness of the proposed 3D similarity-dissimilarity plot.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Humans
  • Pattern Recognition, Automated*