Sorting of Mixed Oryza sativa L. Seeds by Terahertz Spectrum and Feature Selection Algorithm
Agricultural production safety is an important component of food safety. Oryza sativa subsp. japonica Kato is a rice consumed daily, and the rapid inspection of low-quality mixed seeds is an important research task in related fields. In this study, spectral signals of 220 samples of mixed and pure rice varieties were collected using terahertz time-domain spectroscopy. The spectral data were pre-processed by Fourier transform (FT), which converted the time-domain signals into frequency-domain signals used as modeling data sets. Five pattern recognition models, including QUEST, were compared for sorting. To further optimize the model, the random forest (RF) algorithm, the successive projections algorithm (SPA), and the variable combination population analysis-iteratively retaining informative variables algorithm (VCPA-IRIV) were used for feature selection. The three algorithms selected 9, 6, and 25 important feature frequencies, respectively, and the characteristic frequencies selected by the coupled VCPA-IRIV algorithm contained the most abundant spectral information. Modeling after characteristic frequency selection was significantly superior to full-spectrum modeling in terms of analysis speed and recognition accuracy. The QUEST and KNN classifiers based on the 25 characteristic frequencies screened by the VCPA-IRIV algorithm both achieved 100% identification accuracy. The coupled VCPA-IRIV algorithm could effectively select information-rich characteristic frequencies of the terahertz spectrum and improve the accuracy of the established recognition model. The identification model based on the terahertz spectrum and the coupled feature selection algorithm was fast and accurate, and offers a new approach for detecting poor-quality Oryza sativa subsp. japonica Kato seeds.
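The workflow summarized above (Fourier transform of time-domain traces, selection of characteristic frequencies, then classification) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the traces are synthetic sinusoids rather than terahertz measurements, the class labels and frequency bins are hypothetical, and the selection step is a simple ranking by difference of class means that stands in for RF, SPA, or VCPA-IRIV rather than implementing any of them. Only the KNN step corresponds to a classifier named in the abstract.

```python
import cmath
import math
import random


def amplitude_spectrum(signal):
    """Naive DFT: amplitudes of the first n // 2 frequency bins."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff))
    return spectrum


def synthetic_trace(freq_bin, rng, n=64, noise=0.05):
    """Synthetic time-domain trace: one sinusoid plus Gaussian noise."""
    return [math.sin(2 * math.pi * freq_bin * t / n) + rng.gauss(0.0, noise)
            for t in range(n)]


def select_bins(spectra, labels, n_keep):
    """Stand-in feature selection: keep the bins whose class means differ most.

    This is NOT VCPA-IRIV, SPA, or RF; it only plays their role in the sketch.
    """
    first, second = sorted(set(labels))
    n_bins = len(spectra[0])

    def class_mean(label, b):
        vals = [s[b] for s, y in zip(spectra, labels) if y == label]
        return sum(vals) / len(vals)

    scores = [abs(class_mean(first, b) - class_mean(second, b))
              for b in range(n_bins)]
    top = sorted(range(n_bins), key=lambda b: scores[b], reverse=True)[:n_keep]
    return sorted(top)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    nearest = sorted(range(len(train_x)),
                     key=lambda i: sq_dist(train_x[i], query))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def make_set(class_bins, n_per_class, rng):
    """Build spectra and labels for each hypothetical class."""
    spectra, labels = [], []
    for label, freq_bin in class_bins.items():
        for _ in range(n_per_class):
            spectra.append(amplitude_spectrum(synthetic_trace(freq_bin, rng)))
            labels.append(label)
    return spectra, labels


rng = random.Random(0)
class_bins = {"pure": 5, "mixed": 9}  # hypothetical dominant frequency bins

train_spectra, train_labels = make_set(class_bins, 10, rng)
test_spectra, test_labels = make_set(class_bins, 5, rng)

# Select characteristic bins on the training set only, then classify.
selected = select_bins(train_spectra, train_labels, n_keep=2)
train_x = [[s[b] for b in selected] for s in train_spectra]
test_x = [[s[b] for b in selected] for s in test_spectra]

predictions = [knn_predict(train_x, train_labels, q) for q in test_x]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(selected, accuracy)
```

The synthetic classes are separable by construction, so the accuracy here says nothing about real seed spectra. The sketch only shows why classifying on a few selected frequencies is cheaper than on the full spectrum: each distance computation uses 2 values instead of 32.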