Origin Traceability Method for Angelica dahurica Based on Near-Infrared Spectroscopy Combined with a Data Enhancement CNN Algorithm

OBJECTIVE To establish an origin classification model for Angelica dahurica from a class-imbalanced sample set, using near-infrared (NIR) spectroscopy combined with a data-augmented convolutional neural network (CNN) algorithm, a task of both theoretical and practical value for tracing the origin of Chinese medicinal materials.
METHODS Ninety-five Angelica dahurica samples were collected, and their NIR spectra were acquired over the wavenumber range 12 500 to 4 000 cm⁻¹. The resulting dataset is small, and its samples are unevenly distributed across origin classes. To improve the generalization ability of the model, three data augmentation algorithms were proposed: spectral shifting, spectral noise addition, and spectral combination. To address the class imbalance, Focal Loss was used as the loss function for training the CNN model.
RESULTS When the three data augmentation algorithms were applied to a support vector machine (SVM) model, adding Gaussian noise with a signal-to-noise ratio of 20 to the spectral data performed best, raising the accuracy of the model to 84.2%. Under class imbalance, the CNN model trained with Focal Loss as the loss function reached an accuracy of 94.7%.
CONCLUSION NIR spectroscopy combined with the data-augmented CNN algorithm provides a rapid, non-destructive detection method and a reliable data analysis method for tracing the origin of Angelica dahurica, and offers a new methodological reference for tracing the origin of Chinese medicinal materials.
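
The three augmentations and the Focal Loss named in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The abstract does not say how the shift is applied, whether the signal-to-noise ratio of 20 is expressed in decibels, or how spectra are combined, so the sketch assumes a shift along the wavenumber axis with edge padding, an SNR given in dB, and a weighted average of two spectra from the same origin class. Focal Loss is written in its usual form, FL(p_t) = -α (1 - p_t)^γ log(p_t), for a single sample. All function names, parameters, and default values are hypothetical.

```python
import math
import random


def shift_spectrum(spectrum, points):
    """Shift a spectrum along the wavenumber axis by `points` samples.

    Assumed scheme: the vacated end is padded with the nearest edge value.
    """
    n = len(spectrum)
    if points == 0 or n == 0:
        return list(spectrum)
    k = min(abs(points), n)
    if points > 0:
        return [spectrum[0]] * k + list(spectrum[:n - k])
    return list(spectrum[k:]) + [spectrum[-1]] * k


def add_gaussian_noise(spectrum, snr_db, rng=random):
    """Add zero-mean Gaussian noise at a target SNR (assumed to be in dB)."""
    signal_power = sum(x * x for x in spectrum) / len(spectrum)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in spectrum]


def combine_spectra(a, b, weight):
    """Blend two spectra (assumed: same origin class) by weighted average."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same length")
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal Loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    p_t is the softmax probability of the true class, computed through a
    numerically stable log-softmax. With gamma = 0 and alpha = 1 this is
    ordinary cross-entropy.
    """
    m = max(logits)
    log_sum = math.log(sum(math.exp(z - m) for z in logits))
    log_p_t = (logits[target] - m) - log_sum
    p_t = math.exp(log_p_t)
    return -alpha * (1.0 - p_t) ** gamma * log_p_t


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy spectrum standing in for one NIR measurement (not real data).
    base = [1.0 + 0.1 * math.sin(i / 10.0) for i in range(200)]
    augmented = [
        shift_spectrum(base, 3),
        add_gaussian_noise(base, snr_db=20, rng=rng),
        combine_spectra(base, shift_spectrum(base, -3), weight=0.7),
    ]
    print(len(augmented), "augmented spectra of length", len(augmented[0]))
    print("focal loss, confident and correct:", focal_loss([4.0, 0.0, 0.0], 0))
    print("focal loss, confident and wrong:  ", focal_loss([0.0, 4.0, 0.0], 0))
```

The factor (1 - p_t)^γ shrinks the loss of samples the model already classifies confidently, so training focuses on hard samples. In an imbalanced dataset those hard samples often belong to the under-represented origin classes, and the optional α weight can be raised for those classes.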

Keywords: near-infrared spectroscopy; Angelica dahurica; origin traceability; data enhancement; convolutional neural network

郭兆华、文师召、李思凡、王琪、王颖鑫、王鑫国、牛丽颖、李亚薇、冯薇

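Microwave Scatter Communication Department, Network Communication Research Institute, China Electronics Technology Group Corporation, Shijiazhuang 050050

School of Statistics and Data Science, Nankai University, Tianjin 300192

College of Sciences, Northeastern University, Shenyang 110167

College of Pharmacy, Hebei University of Chinese Medicine; Hebei Engineering Research Center for Quality Evaluation and Standardization of Chinese Medicinal Materials, Shijiazhuang 050091

Liaoning Inspection, Testing and Certification Center; Liaoning Academy of Analytical Sciences, Shenyang 110032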

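Chinese Pharmaceutical Journal (中国药学杂志)
Chinese Pharmaceutical Association
Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.957
ISSN: 1001-2494
Year, volume (issue): 2024, 59(21)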