Metabolomics Combined with Machine Learning LASSO Regression to Identify Differential Biomarkers between Taigu Yam and Tiegun Yam
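An Li (安莉) 1, Zhou Juan (周娟) 1, Ma Jingwei (马婧玮) 1, Chen He (陈贺) 1, Wang Fei (王飞) 2, Liang Huizhen (梁慧珍) 3, Hao Chenyu (郝晨宇) 4, Wu Xujin (吴绪金) 1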
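Author information
1. Institute of Quality and Safety for Agro-products, Henan Academy of Agricultural Sciences, Zhengzhou 450002, Henan, China
2. Institute of Plant Protection, Henan Academy of Agricultural Sciences, Zhengzhou 450002, Henan, China
3. Institute of Chinese Herbal Medicines, Henan Academy of Agricultural Sciences, Zhengzhou 450002, Henan, China
4. School of Public Health and Health Sciences, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China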
Abstract
This study used metabolomics to screen for differential metabolites between Taigu yam and Tiegun yam, and applied least absolute shrinkage and selection operator (LASSO) regression, a machine learning method, to determine differential markers for predicting yam variety. Metabolites in the two yams were analyzed by ultra-performance liquid chromatography-quadrupole time-of-flight tandem mass spectrometry (UPLC-Q-TOF-MS/MS). Principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA) were used to identify the differential metabolites, and LASSO regression was then used to select differential markers from them and build a prediction model for variety identification. A total of 206 metabolites were identified in the two yams. PCA clearly separated Taigu yam from Tiegun yam, and OPLS-DA further screened out 56 metabolites that differed significantly between them. LASSO regression on these 56 metabolites yielded five key differential markers: ophiogenin 3-O-beta-L-rhamnopyranosyl-beta-D-glucopyranoside, aspartic acid, epicatechin gallate, schaftoside, and gallocatechin. These markers were used to establish a LASSO regression prediction model for distinguishing Taigu yam from Tiegun yam. By combining metabolomics with LASSO regression, this study identified differential markers between the two varieties and constructed a variety prediction model, offering a new approach to yam identification.
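The marker-selection step can be illustrated with a minimal sketch. The abstract does not state which software, penalty value, or model family the authors used, so everything below is an assumption: the data are synthetic (56 hypothetical features, 5 of them shifted between two classes), the variety label is coded as -1/+1 and fitted with a plain linear LASSO rather than a binomial one, and the names `make_synthetic`, `standardize`, `lasso_coordinate_descent`, and the penalty `lam=0.3` are invented for illustration. The point is only to show how the L1 penalty drives most coefficients to exactly zero and leaves a short list of candidate markers.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator; this is what produces exact zeros."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def standardize(X):
    """Scale every column to zero mean and unit (population) variance."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / sd for v in col])
    return [[cols[j][i] for j in range(p)] for i in range(n)]


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 for standardized X."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n              # optimal intercept when columns are centred
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam)  # unit-variance columns: no rescaling
            if new != beta[j]:
                delta = new - beta[j]
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return b0, beta


def make_synthetic(n_per_class=10, n_features=56, n_informative=5, seed=0):
    """Synthetic stand-in for a metabolite table; NOT the study's data."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1.0, 1.0):    # two hypothetical varieties
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for j in range(n_informative):
                row[j] += 2.0 * label   # only these features differ by class
            X.append(row)
            y.append(label)
    return X, y


X_raw, y = make_synthetic()
X = standardize(X_raw)
intercept, beta = lasso_coordinate_descent(X, y, lam=0.3)

# Features with non-zero coefficients are the candidate markers.
selected = [j for j, b in enumerate(beta) if b != 0.0]

# Classify by the sign of the fitted value (training samples only).
scores = [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]
predictions = [1.0 if s > 0 else -1.0 for s in scores]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)

print("selected feature indices:", selected)
print("training accuracy:", accuracy)
```

In practice the penalty would be chosen by cross-validation and the model evaluated on held-out samples; training accuracy on synthetic data, as printed here, says nothing about how the published model performs.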