Investigating the impact of attenuated fluorescence spectra on protein discrimination

Opt Express. 2023 Oct 23;31(22):35507-35518. doi: 10.1364/OE.499362.

Abstract

Optical remote sensing techniques are promising for the real-time detection and identification of different types of hazardous biological materials. However, fluorescence spectra received from a remote distance are distorted in spectral shape by atmospheric attenuation. To investigate the influence of atmospheric attenuation on characterizing and classifying biological agents, laboratory-measured fluorescence data of fourteen proteins were combined with atmospheric transmission factors from the MODTRAN model at different detection ranges. The multivariate analysis techniques of principal component analysis (PCA) and linear discriminant analysis (LDA), together with Random Forest and XGBoost predictors, were employed to assess the separability and distinguishability of the resulting spectra. The results showed that the spectral shift of the attenuated spectra varied as a function of the detection range, the atmospheric visibility, and the spectral distribution. According to the PCA and LDA analyses, the spectral explanatory power of the decomposed factors was redistributed as the attenuation effect increased, which was consistent with the hierarchical clustering results. Random Forest outperformed XGBoost in classifying protein samples, while the two methods performed similarly in identifying harmful and harmless subgroups of proteins. Using fewer subgroups made the classification accuracy less sensitive to the attenuation effect. Our analysis demonstrated that incorporating atmospheric transport models when building a fluorescence spectral database is essential for fast identification of spectra, and that reduced classification criteria could improve the compatibility between the spectral database and classification algorithms.
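
The dependence of the spectral shift on detection range and visibility can be illustrated with a minimal Python sketch. It does not reproduce the study's MODTRAN calculations or its protein data. As labeled assumptions, it substitutes a simple Kruse-type visibility relation for the extinction coefficient, applies one-way Beer-Lambert transmission, and uses a hypothetical Gaussian emission band centred at 340 nm; the function names and all parameter values are illustrative only.

```python
import numpy as np

def extinction_per_km(wavelength_nm, visibility_km, q=1.3):
    """Aerosol extinction coefficient in 1/km from a Kruse-type relation.

    Assumption: this simple empirical model stands in for MODTRAN.
    """
    return (3.912 / visibility_km) * (wavelength_nm / 550.0) ** (-q)

def attenuate(spectrum, wavelength_nm, range_km, visibility_km):
    """Apply one-way Beer-Lambert transmission to a spectrum."""
    beta = extinction_per_km(wavelength_nm, visibility_km)
    return spectrum * np.exp(-beta * range_km)

def centroid(spectrum, wavelength_nm):
    """Intensity-weighted mean wavelength of a spectrum, in nm."""
    return float(np.sum(wavelength_nm * spectrum) / np.sum(spectrum))

# Hypothetical emission band (not measured data): Gaussian centred at 340 nm.
wavelength_nm = np.linspace(300.0, 500.0, 201)
spectrum = np.exp(-0.5 * ((wavelength_nm - 340.0) / 25.0) ** 2)

def centroid_shift(range_km, visibility_km=10.0):
    """Centroid shift (nm) of the attenuated spectrum relative to the original."""
    attenuated = attenuate(spectrum, wavelength_nm, range_km, visibility_km)
    return centroid(attenuated, wavelength_nm) - centroid(spectrum, wavelength_nm)

for r in (0.5, 1.0, 2.0):
    print(f"range {r} km: centroid shift {centroid_shift(r):+.2f} nm")
```

Under these assumptions, shorter wavelengths are attenuated more strongly, so the centroid moves toward longer wavelengths as the range grows or the visibility drops. A database built only from laboratory spectra would not capture this distortion, which is the motivation the abstract gives for incorporating an atmospheric transport model.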

MeSH terms

  • Algorithms*
  • Discriminant Analysis
  • Multivariate Analysis
  • Principal Component Analysis
  • Random Forest*