Influence of Constrained and Unconstrained Wavelength Selection Methods on Performance of Quantitative Modeling

Because the information from multiple components overlaps heavily in near-infrared (NIR) spectra, variable selection is especially important for building robust qualitative and quantitative NIR models. In this work, 655 tobacco leaf samples collected from 8 growing regions and 4 leaf positions were used, with six indicators as targets: total sugar, reducing sugar, total nitrogen, potassium, chlorine, and nicotine. The influence of wavelength-interval constraints and of 3 typical unconstrained variable selection methods on quantitative modeling was studied in depth, to explore how results differ between models built purely on mathematical and statistical methods and models built after constrained characteristic wavelengths are added. These results were also compared with models built on the full wavelength range and on the full set of characteristic wavelength intervals. For the data studied, the coefficients of variation (CV) of the Q² values of the partial least squares (PLS) models on the 131 external validation samples were all within 3%, whereas the selected variables and wavelength intervals differed considerably. These results indicate that building "secondary analysis" models on NIR data has an inherent performance "bottleneck" and that the variables act synergistically. Excessive use of variable selection algorithms and modeling analysis does not necessarily improve model quality or prediction results, and may instead greatly reduce the chemical interpretability of the model.
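The comparison described in the abstract rests on two summary statistics: Q² on an external validation set, and the coefficient of variation (CV) of those Q² values across variable selection strategies. The sketch below illustrates how the two could be computed. It is not the authors' code. The Q² definition used here (total sum of squares taken about the validation-set mean) is an assumption, since the abstract does not state which variant was used. The function names and the numbers in `q2_by_method` are made up for illustration only and are not results from the paper.

```python
import numpy as np


def q2_external(y_true, y_pred):
    """Q2 = 1 - PRESS / TSS on an external validation set.

    Assumption: TSS is taken about the mean of the validation
    reference values; other Q2 variants use the training mean.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    press = np.sum((y_true - y_pred) ** 2)         # prediction error sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    return 1.0 - press / tss


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()


# Hypothetical Q2 values for one indicator under three variable sets.
# These are illustrative placeholders, not values reported in the paper.
q2_by_method = {
    "full_spectrum": 0.95,
    "constrained_intervals": 0.94,
    "unconstrained_selection": 0.96,
}

cv_percent = coefficient_of_variation(list(q2_by_method.values()))
print(f"CV of Q2 across methods: {cv_percent:.2f}%")
```

A small CV under this calculation means the variable sets predict about equally well, even when the selected wavelengths differ, which is the pattern the abstract reports.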

near-infrared spectroscopy; variable selection; wavelength constraint; robustness; chemometrics

Zhang Yipeng, Yan Keliang, Tang Li, Wen Liliang, Jiang Hui, Chen Aiming, Pang Tao, Yang Qianxu, Zhu Baokun, Zeng Zhongda

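China Tobacco Yunnan Industrial Co., Ltd., Kunming, Yunnan 650231

Dalian ChemDataSolution Information Technology Co., Ltd., Dalian, Liaoning 116023

Yunnan Academy of Tobacco Agricultural Sciences, Yuxi, Yunnan 653100

College of Environmental and Chemical Engineering, Dalian University, Dalian, Liaoning 116622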

Science and Technology Project of China Tobacco Yunnan Industrial Co., Ltd.

2022CP02

2024

Flavour Fragrance Cosmetics (香料香精化妆品)
Shanghai Research Institute of Fragrance and Flavor Industry (上海香料研究所)

CSTPCD
Impact factor: 0.341
ISSN:1000-4475
Year, volume (issue): 2024, (3)