Predictive data clustering of laser-induced breakdown spectroscopy for brain tumor analysis

Biomed Opt Express. 2021 Jun 24;12(7):4438-4451. doi: 10.1364/BOE.431356. eCollection 2021 Jul 1.

Abstract

Because training spectra for different tissue types are scarce, laser-induced breakdown spectroscopy (LIBS) struggles to reach the desired diagnostic accuracy with conventional supervised learning methods alone. In this paper, we propose combining predictive data clustering with supervised learning to identify tissue information accurately. The meanshift clustering method is introduced and compared with three other clustering methods that have been used in the LIBS field. We propose the cluster precision (CP) score as a new criterion to be used together with the Calinski-Harabasz (CH) score for evaluating the clustering effect. The influence of principal component analysis (PCA) on all four clustering methods is also analyzed. PCA-meanshift shows the best clustering effect in a comprehensive evaluation that combines the CH and CP scores. Using the spatial location and feature similarity information provided by the predictive clustering, PCA-meanshift improves diagnostic accuracy from less than 95% to 100% for all classifiers, including support vector machine (SVM), k-nearest neighbor (k-NN), soft independent modeling of class analogy (SIMCA), and random forest (RF) models.
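The PCA-meanshift step and its CH-score evaluation can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation. Synthetic Gaussian blobs stand in for LIBS spectra, and the helper name `pca_meanshift`, the number of principal components, and the bandwidth quantile are all assumptions chosen for illustration. The CP score is not sketched because the abstract does not give its definition.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import calinski_harabasz_score


def pca_meanshift(spectra, n_components=3, quantile=0.2, random_state=0):
    """Hypothetical helper: reduce spectra with PCA, then cluster with mean shift.

    n_components and quantile are illustrative assumptions, not values
    taken from the paper.
    """
    # Project the high-dimensional spectra onto a few principal components.
    pca = PCA(n_components=n_components, random_state=random_state)
    scores = pca.fit_transform(spectra)

    # Mean shift needs a kernel bandwidth; estimate it from the data.
    bandwidth = estimate_bandwidth(
        scores, quantile=quantile, random_state=random_state
    )
    labels = MeanShift(bandwidth=bandwidth).fit_predict(scores)
    return scores, labels


# Synthetic stand-in for LIBS spectra: 150 samples, 50 "wavelength" features,
# drawn from 3 well-separated groups (assumed, not real tissue data).
spectra, _ = make_blobs(
    n_samples=150, n_features=50, centers=3, cluster_std=1.0, random_state=0
)

scores, labels = pca_meanshift(spectra)
n_clusters = len(np.unique(labels))

# The CH score is only defined when at least two clusters are found.
ch = calinski_harabasz_score(scores, labels) if n_clusters > 1 else float("nan")
print(f"clusters found: {n_clusters}, CH score: {ch:.1f}")
```

A higher CH score indicates tighter, better-separated clusters. Unlike k-means, mean shift does not require the number of clusters to be fixed in advance, so the cluster count here is an output of the bandwidth choice.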