
Screening of Key Genes in Rheumatoid Arthritis Based on Bioinformatics and Machine Learning

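Objective: To apply bioinformatics analysis and machine learning to rheumatoid arthritis (RA) gene datasets and screen for key genes that are potential diagnostic and therapeutic targets. Methods: RA-related datasets were obtained and differentially expressed genes (DEGs) were identified. Two machine learning algorithms, least absolute shrinkage and selection operator (LASSO) and multiple support vector machine recursive feature elimination (mSVM-RFE), were used to screen key genes, and receiver operating characteristic (ROC) curves were plotted to evaluate the potential value of the key genes as diagnostic and therapeutic targets. Results: A total of 377 DEGs were identified from the two databases, of which 266 were up-regulated and 111 were down-regulated. The two machine learning algorithms yielded six key genes: HCP5, LRRC15, MREG, SDC1, SLC26A10 and SNX10. ROC curve analysis showed that, in the training set, the areas under the curve (AUC) of these six genes for diagnosing RA were 0.959, 0.945, 0.878, 0.929, 0.882 and 0.903, respectively, all above 0.8. In the validation set, the AUCs were 0.821, 0.912, 0.971, 0.997, 0.671 and 0.894, respectively, all above 0.8 except for SLC26A10, indicating that the six key genes HCP5, LRRC15, MREG, SDC1, SLC26A10 and SNX10 have high diagnostic value for RA. Conclusion: The key genes obtained by bioinformatics and machine learning analysis may be potential diagnostic markers and precision treatment targets for RA.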
Identification of Key Genes in Rheumatoid Arthritis Based on Bioinformatics and Machine Learning
Objective To analyze rheumatoid arthritis (RA) gene datasets using bioinformatics and machine learning, and to screen for key genes that are potential diagnostic and therapeutic targets. Methods RA-related datasets were obtained to screen differentially expressed genes (DEGs). Least absolute shrinkage and selection operator (LASSO) and multiple support vector machine recursive feature elimination (mSVM-RFE) were applied to screen key genes, and receiver operating characteristic (ROC) curves of the key genes were drawn to evaluate their potential value as diagnostic and therapeutic targets. Results A total of 377 DEGs were screened from the two databases, including 266 up-regulated genes and 111 down-regulated genes. Six key genes were identified by the two machine learning algorithms: HCP5, LRRC15, MREG, SDC1, SLC26A10 and SNX10. ROC curve analysis showed that the areas under the curve (AUC) of the six key genes for RA diagnosis in the training set were 0.959, 0.945, 0.878, 0.929, 0.882 and 0.903, respectively, all above 0.8. The AUCs of the six key genes in the validation set were 0.821, 0.912, 0.971, 0.997, 0.671 and 0.894, respectively, all greater than 0.8 except for SLC26A10, indicating that the six key genes had high diagnostic value for RA. Conclusion The key genes obtained by bioinformatics analysis and machine learning algorithms may be potential diagnostic markers and precision treatment targets for RA.
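
As a rough illustration of the per-gene ROC evaluation described above, the sketch below computes an AUC as the fraction of (RA, control) sample pairs in which the RA sample has the higher expression value, then applies the 0.8 cutoff to the AUCs reported in the abstract. The function name `auc` and the toy expression values are hypothetical and are not taken from the study; the abstract does not say which software the authors used.

```python
from typing import List, Sequence

def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Rank-based AUC: probability that a random case scores above a random control.

    Ties count as half a win. For a down-regulated gene the value falls
    below 0.5, and 1 - AUC gives the discriminative ability.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical expression values for one gene (illustration only, not study data).
toy_ra = [7.9, 8.4, 6.8, 9.1, 8.0]
toy_control = [6.1, 7.0, 6.5, 8.2, 5.9]
toy_auc = auc(toy_ra, toy_control)

# AUCs as reported in the abstract, in the order the genes are listed.
GENES = ["HCP5", "LRRC15", "MREG", "SDC1", "SLC26A10", "SNX10"]
TRAIN_AUC = [0.959, 0.945, 0.878, 0.929, 0.882, 0.903]
VALID_AUC = [0.821, 0.912, 0.971, 0.997, 0.671, 0.894]
CUTOFF = 0.8  # threshold used in the abstract

def genes_not_above(aucs: Sequence[float], cutoff: float = CUTOFF) -> List[str]:
    """Return the genes whose AUC does not exceed the cutoff."""
    return [g for g, a in zip(GENES, aucs) if a <= cutoff]

train_fail = genes_not_above(TRAIN_AUC)
valid_fail = genes_not_above(VALID_AUC)
```

The pairwise definition is equivalent to the area under the empirical ROC curve and avoids any dependency beyond the standard library; a real analysis would more likely rely on an established statistics package.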

Rheumatoid arthritis; Bioinformatics; Machine learning; Key gene

Liu Huajing (刘华荆), Liu Ying (刘莹), Ding Jinya (丁进亚)


Postdoctoral Workstation, General Hospital of Central Theater Command, Wuhan, Hubei 430070, China

Hubei Provincial Key Laboratory of Central Nervous System Tumorigenesis and Intervention

Department of Clinical Laboratory, General Hospital of Central Theater Command, Wuhan, Hubei 430070, China


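Open Fund of the Hubei Provincial Key Laboratory; Postdoctoral Research Start-up Fund of the General Hospital of Central Theater Command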

ZZYKF20221020211015KY17

2024

Military Medical Journal of South China (华南国防医学杂志)
Medical Science and Technology Committee of Guangzhou Military Region

CSTPCD
Impact factor: 0.748
ISSN: 1009-2595
Year, volume (issue): 2024, 38(4)