[Study of Modeling Samples Selection Method Based on Near Infrared Spectrum]

Guang Pu Xue Yu Guang Pu Fen Xi. 2016 Dec;36(12):3920-5.
[Article in Chinese]

Abstract

To classify multiple wheat varieties, we use near-infrared spectroscopy for qualitative analysis. Increasing the number of modeling samples adds information to the model, but it also introduces redundancy, which increases modeling time and storage requirements; the modeling set therefore needs to be reduced through sample selection. If samples are selected blindly, however, information is lost and model performance degrades. Building on traditional selection methods, we propose a k nearest neighbor-density sample selection method. The experiments use near-infrared diffuse reflectance spectra of wheat seeds collected over many days. First, the original wheat spectra are preprocessed and features are extracted; then modeling samples are selected by three methods: random sampling, k nearest neighbor, and k nearest neighbor-density; finally, BPR (Biomimetic Pattern Recognition) and BPRI (Biomimetic Pattern Recognition Improved) models are established. The results show that, for the BPR model, k nearest neighbor-density selection gives the best results while also substantially reducing the modeling sample size. For the BPRI model, k nearest neighbor-density selection performs much better than random sampling and slightly better than k nearest neighbor, while using a much smaller modeling set than k nearest neighbor. These results show that k nearest neighbor-density sample selection can greatly reduce the modeling sample size while preserving model quality, and that it is effective for wheat variety classification.
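
The abstract does not spell out the k nearest neighbor-density rule, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that density is measured by the mean distance to the k nearest neighbors, that denser samples are visited first, and that the k nearest neighbors of each kept sample are dropped as redundant. The function name `knn_density_select` and the parameter `k` are hypothetical.

```python
import numpy as np

def knn_density_select(X, k=3):
    """Return sorted row indices of a reduced subset of X.

    Assumed rule (illustrative only): visit samples from densest to
    sparsest, keep a sample if it is still available, then discard its
    k nearest neighbours as redundant.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n <= k:
        # Too few samples to thin out; keep everything.
        return np.arange(n)

    # Pairwise Euclidean distances; a sample is never its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)

    # Indices of the k nearest neighbours of every sample.
    nn = np.argsort(dist, axis=1)[:, :k]
    # Small mean k-NN distance means high local density.
    mean_knn_dist = np.take_along_axis(dist, nn, axis=1).mean(axis=1)

    order = np.argsort(mean_knn_dist, kind="stable")
    available = np.ones(n, dtype=bool)
    selected = []
    for i in order:
        if not available[i]:
            continue
        selected.append(i)
        available[i] = False
        available[nn[i]] = False  # neighbours are treated as redundant
    return np.sort(np.array(selected))

# Usage on synthetic feature vectors (stand-ins for extracted spectral features).
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 5))
kept = knn_density_select(features, k=3)
print(len(kept), "of", len(features), "samples kept")
```

In a classification setting such as the one described, a routine like this would presumably be applied separately to the samples of each variety, so that every class stays represented in the reduced modeling set.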

Publication types

  • English Abstract