China Brewing (中国酿造), 2024, Vol. 43, Issue 1: 184-189. DOI: 10.11882/j.issn.0254-5071.2024.01.029

Research on grade classification of strong-flavor Baijiu base liquor based on unbalanced data sets

王继华 李兆飞 杨壮 赵娜 张贵宇

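Author information

  • 1. Artificial Intelligence Key Laboratory of Sichuan Province, Sichuan University of Science & Engineering, Yibin 644000, Sichuan; School of Automation and Information Engineering, Sichuan University of Science & Engineering, Yibin 644000, Sichuan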

Abstract

To address the decline in classification performance caused by imbalanced samples in the grade classification of strong-flavor (Nongxiangxing) Baijiu base liquor, using data collected by gas chromatography-mass spectrometry (GC-MS), a classification method for imbalanced data sets was proposed. First, the synthetic minority over-sampling technique (SMOTE) was used to expand the minority-class samples of strong-flavor base liquor and reduce the sample imbalance. Then, sparse principal component analysis (SPCA) was used to reduce the dimensionality of the GC-MS spectral data. Finally, a deep forest (DF) classifier was used to build the classification and recognition model for strong-flavor Baijiu base liquor. The results showed that balancing the base liquor data set with the SMOTE algorithm effectively improved classification accuracy, and the resulting model reached an accuracy of 96.61%. The model can provide guidance and reference for the grade classification of base liquor.
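
The oversampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy feature values are hypothetical and chosen only for illustration, and the SPCA and deep forest steps are not shown.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [s for j, s in enumerate(minority) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 3 minority-grade samples with 2 features each.
minority_grade = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2]]
new_samples = smote(minority_grade, n_synthetic=4, k=2)
balanced_minority = minority_grade + new_samples
print(len(balanced_minority))  # 7
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already covered by the minority class, so the class is enlarged without duplicating samples exactly.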

Key words

gas chromatography-mass spectrometry/strong-flavor Baijiu base liquor/synthetic minority over-sampling technique/sparse principal component analysis/base liquor classification


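Funding

Key Science and Technology Program of the Zigong Science and Technology Bureau, Sichuan Province (2019YYJC15)

Scientific Research Project of Sichuan University of Science & Engineering (2020RC32)

Graduate Innovation Fund of Sichuan University of Science & Engineering (Y2022150)

Graduate Course Construction Project of Sichuan University of Science & Engineering (AL202213)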

Publication year

2024
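China Brewing (中国酿造)
China Condiment Industrial Association; Beijing Academy of Food Sciences

Indexing: CSTPCD; Peking University Core Journals
Impact factor: 0.759
ISSN: 0254-5071
Number of references: 16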