Identification of Geographical Origin for Hawthorn Based on Hyperspectral Imaging Technology
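Liu Zijian (刘子健) 1, Gu Jiasheng (顾佳盛) 1, Zhou Cong (周聪) 2, Wang Youyou (王游游) 3, Yang Jian (杨健) 2, Huang Jun (黄俊) 1, Wang Hongpeng (王宏鹏) 1, Bai Ruibin (白瑞斌) 2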
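Author information
- 1. School of Biological and Chemical Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, Zhejiang, China
- 2. State Key Laboratory for Quality Assurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China; Jiangxi Research Center for Quality Evaluation of Dao-di Herbs, Nanchang 330000, Jiangxi, China
- 3. State Key Laboratory for Quality Assurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China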
Abstract
Geographical origin is one of the important factors affecting the quality of hawthorn. To enable rapid, nondestructive discrimination of hawthorn origin, this study applied hyperspectral imaging to trace the origin of hawthorn samples from five provincial production areas. Using a near-infrared hyperspectral imaging system, visible to shortwave infrared (410-2500 nm) hyperspectral data were acquired for each sample in three orientations: pedicel up (G), side up (C), and bottom up (D). Four preprocessing methods, namely multiplicative scatter correction (MSC), first derivative (D1), Savitzky-Golay (SG) smoothing, and standard normal variate (SNV) transformation, were each combined with three classification models: partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), and random forest (RF). The D-D1-SVM model (bottom-up orientation, D1 preprocessing, SVM classifier) performed best, with 100% accuracy on both the training and prediction sets. To simplify the model further, the successive projections algorithm (SPA) and competitive adaptive reweighted sampling (CARS) were used to select feature wavelengths. Multivariate data analysis showed that, among the models built on the selected wavelengths, D-SPA-SVM performed best, with accuracies of 95.2% and 93% on the training and prediction sets, respectively. This study provides technical support for rapid, nondestructive identification of hawthorn origin.
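The spectral preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy spectra, the function names, the use of the population standard deviation in SNV, and the central-difference form of the first derivative are all assumptions made for illustration (the paper does not state how D1 was computed). SG smoothing, the classifiers, and wavelength selection are left out.

```python
from statistics import mean, pstdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and standard deviation (population std is an assumption here)."""
    m = mean(spectrum)
    s = pstdev(spectrum)
    return [(x - m) / s for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction: regress each spectrum on the mean
    spectrum (s ~ a + b * ref), then return (s - a) / b."""
    n_bands = len(spectra[0])
    ref = [mean([s[j] for s in spectra]) for j in range(n_bands)]
    ref_mean = mean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Ordinary least-squares slope and intercept against the reference.
        b = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        a = s_mean - b * ref_mean
        corrected.append([(x - a) / b for x in s])
    return corrected


def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative over the interior bands
    (the two edge bands are dropped)."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
        for i in range(1, len(spectrum) - 1)
    ]


# Hypothetical toy data: one base spectrum seen under three different
# additive offsets and multiplicative scatter factors.
base = [0.2, 0.4, 0.9, 0.5, 0.3]
spectra = [[a + b * x for x in base] for a, b in [(0.0, 1.0), (0.1, 1.5), (-0.05, 0.8)]]

snv_out = [snv(s) for s in spectra]
msc_out = msc(spectra)
d1_out = [first_derivative(s) for s in spectra]
```

Because the three toy spectra differ only by an offset and a positive scale factor, SNV and MSC each map them onto a single common curve. The first derivative removes the offset but leaves the scale difference in place.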
Key words
hyperspectral imaging technology/hawthorn/origin identification/nondestructive testing/machine learning
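Funding
"Leading Goose" R&D Program of Zhejiang Province (2022C02023)
Major Increase/Decrease Expenditure Project at the Central Government Level (2060302)
Zhejiang University of Science and Technology Special Research Fund (2023QN024)
Zhejiang University of Science and Technology Special Research Fund (2023JLZD007)
Quality Technology Service Platform Project for the Whole Industry Chain of Traditional Chinese Medicine (2022-230-221)
Publication year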
2024