Rapid Identification of Six Major Tea Categories Based on Near-Infrared Spectroscopy

Zhang Lingzhi¹, Huang Yan², Yu Yingjie³, Lin Gang⁴, Sun Weijiang⁵

Author information

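1. College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China; Province-Ministry Co-constructed Collaborative Innovation Center for Safe Production of Cross-Strait Specialty Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
2. College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China; Anxi College of Tea Science, Fujian Agriculture and Forestry University, Quanzhou 362400, Fujian, China; Fujian Tea Industry Engineering Technology Research Center, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China; Province-Ministry Co-constructed Collaborative Innovation Center for Safe Production of Cross-Strait Specialty Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China
3. China Tea Marketing Association, Beijing 100801, China
4. Fujian Rongyuntong Ecological Technology Co., Ltd., Fuzhou 350025, Fujian, China
5. College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China; Fujian Tea Industry Engineering Technology Research Center, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China; Province-Ministry Co-constructed Collaborative Innovation Center for Safe Production of Cross-Strait Specialty Crops, Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China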

Abstract

To build a high-quality recognition model for the six major tea categories, this study collected 370 samples and acquired their near-infrared (NIR) spectra. Rapid recognition models were developed by combining the near-infrared spectroscopy (NIRS) data with spectral pre-processing, feature extraction, and data-mining classifier algorithms. The results indicated that: 1) support vector machine (SVM) and random forest (RF) classifiers were both suitable for constructing rapid identification models for the six tea categories; 2) the SVM classifier was better suited to modeling with the original spectrum (OS), and pre-processing tended to weaken the discriminatory performance of models built on this classifier; 3) the RF classifier was better suited to modeling with pre-processed spectra, and the resulting models showed clear improvements over the OS models in both recognition accuracy (RA) and area under the receiver operating characteristic curve (AUC); 4) among the feature extraction algorithms, linear discriminant analysis (LDA) performed best, yielding models with clearly higher RA than the OS models. The optimal model, OS-LDA-SVM, achieved an RA of 100.00% and an AUC of 1.00, demonstrating high recognition accuracy, strong generalization ability, excellent overall performance, and potential for industrial application. In summary, building models from NIRS combined with pre-processing, feature extraction algorithms, and classifiers is a highly feasible way to identify the six tea categories. The models have high recognition accuracy and excellent performance, providing scientific, accurate, and efficient technical support for the rapid identification of tea categories in the tea trade and laying a foundation for the industrial application of international tea category identification models.
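The modeling chain named in the abstract (original spectrum, LDA feature extraction, SVM classifier, scored by RA and AUC, with RF as the comparison classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the spectra are synthetic stand-ins generated in the code, and the helper names (`make_synthetic_spectra`, `evaluate`), the 70/30 split, the RBF kernel, the number of trees, and the one-vs-rest AUC averaging are all choices made here because the abstract does not specify them. It uses scikit-learn and NumPy.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=60, n_wavelengths=100, n_classes=6, seed=0):
    """Hypothetical stand-in for NIR spectra: one Gaussian band per class plus noise."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_wavelengths)
    spectra, labels = [], []
    for label in range(n_classes):
        center = (label + 1) / (n_classes + 1)  # band position differs per class
        band = np.exp(-((axis - center) ** 2) / 0.005)
        noise = rng.normal(0.0, 0.3, size=(n_per_class, n_wavelengths))
        spectra.append(band + noise)
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model, then return (recognition accuracy, one-vs-rest AUC) on the test set."""
    model.fit(X_train, y_train)
    ra = accuracy_score(y_test, model.predict(X_test))
    auc = roc_auc_score(y_test, model.predict_proba(X_test), multi_class="ovr")
    return ra, auc


X, y = make_synthetic_spectra()
# Assumed 70/30 stratified split; the abstract does not state how samples were divided.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

models = {
    # Original spectrum -> LDA projection (at most n_classes - 1 = 5 components) -> SVM.
    "OS-LDA-SVM": make_pipeline(
        LinearDiscriminantAnalysis(n_components=5),
        SVC(kernel="rbf", probability=True, random_state=0),
    ),
    # Random forest applied directly to the original spectrum, for comparison.
    "OS-RF": RandomForestClassifier(n_estimators=200, random_state=0),
}

results = {
    name: evaluate(model, X_train, X_test, y_train, y_test)
    for name, model in models.items()
}
for name, (ra, auc) in results.items():
    print(f"{name}: RA={ra:.3f}, AUC={auc:.3f}")
```

Scores obtained on this synthetic data say nothing about the reported results. To try a pre-processing variant, a transformer can be added as the first step of `make_pipeline`, so that it is fitted on the training split only.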

Key words

near-infrared spectroscopy (NIRS) / tea category recognition / support vector machine (SVM) / random forest (RF) / linear discriminant analysis (LDA)

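Funding

National Key Research and Development Program of China, 13th Five-Year Plan (2019YFD1001601)

Open Project of the China White Tea Research Institute (KHCZ2101A)

Open Project of the China White Tea Research Institute (KHCZ2104A)

Fujian Agriculture and Forestry University Project for Scientific and Technological Innovation and Service System Construction of the Tea Industry Chain (K1520005A04)

Science and Technology Innovation Fund of the Fujian Zhang Tianfu Tea Development Foundation (FJZTF01)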

Year of publication

2024

Journal of Food Science and Biotechnology

Jiangnan University

Indexed in: CSTPCD, CSCD, Peking University Core Journals

Impact factor: 0.674

ISSN: 1673-1689

Times cited: 2

Number of references: 3
