Journal of Inner Mongolia University (Natural Science Edition), 2024, Vol. 55, Issue 1: 38-45. DOI: 10.13484/j.nmgdxxbzk.20240106

Screening Key Genes for Rheumatoid Arthritis Based on Bioinformatics and Machine Learning

Hou Tiantian (侯甜甜) 1, Dong Feiya (董斐雅) 1, Dong Jiaqi (董佳琪) 1, Zhang Xiaowei (张晓炜) 2, Liu Yang (刘暘) 2, Fan Guoliang (樊国梁) 1

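Author information

  • 1. School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021
  • 2. Department of Rheumatology and Immunology, The First Affiliated Hospital of Inner Mongolia Medical University, Hohhot 010050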

Abstract

Rheumatoid arthritis (RA) is an organ-specific autoimmune disease characterized by chronic synovitis and bone erosion. Its high disability rate places a serious burden on individuals and society, so effective and reliable diagnostic markers and therapeutic targets for RA are needed. In this study, expression profile datasets and single-cell sequencing datasets were downloaded from the GEO database to elucidate potential candidate genes and pathways in RA. First, five types of cell clusters were annotated using the single-cell sequencing data, 4109 RA-related genes were screened from the cell clusters, and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed. Second, bioinformatics analysis of the expression data identified 677 RA-related genes, and GSEA enrichment analysis of the differentially expressed genes (DEGs) was performed. Finally, an association analysis was performed between the DEGs and the differentially expressed genes of each cell cluster, and the LASSO algorithm was applied for further screening, yielding six key genes: IGLL5, AIM1, NKG7, PSMB9, ANKRD11 and BIRC3. ROC curves for these six genes in the training and validation sets showed good diagnostic performance for rheumatoid arthritis.
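The final screening step described in the abstract (LASSO selection followed by ROC evaluation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the gene names `gene_0` to `gene_5` are hypothetical placeholders, the penalty `lam=0.05` is an arbitrary choice, and a squared-error LASSO on 0/1 labels stands in for whatever LASSO variant the study actually used. The helper names `soft_threshold`, `lasso_cd` and `roc_auc` are likewise assumptions made for illustration.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1.
    X is a list of sample rows with centred columns; y is centred."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, starting from w = 0
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w

def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in for an expression matrix: 120 samples x 6 hypothetical genes.
# Only gene_0 and gene_1 are shifted in the "RA" group; the rest are pure noise.
random.seed(0)
n_samples, n_genes = 120, 6
labels = [i % 2 for i in range(n_samples)]  # 0 = control, 1 = RA (balanced)
raw = [[random.gauss(0, 1) + (2.0 * lab if j < 2 else 0.0) for j in range(n_genes)]
       for lab in labels]

# Centre each gene and the label vector so no intercept term is needed.
means = [sum(row[j] for row in raw) / n_samples for j in range(n_genes)]
X = [[row[j] - means[j] for j in range(n_genes)] for row in raw]
y_mean = sum(labels) / n_samples
y = [lab - y_mean for lab in labels]

weights = lasso_cd(X, y, lam=0.05)
selected = [f"gene_{j}" for j, wj in enumerate(weights) if wj != 0.0]
scores = [sum(wj * xj for wj, xj in zip(weights, row)) for row in X]
print("selected:", selected)
print("AUC on the synthetic training data:", round(roc_auc(scores, labels), 3))
```

The L1 penalty drives the weights of uninformative genes to exactly zero, which is why LASSO doubles as a feature selector. The AUC here is computed on the same samples used for fitting, so it is optimistic; a held-out validation set, as the abstract describes, gives a fairer estimate.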

Key words

rheumatoid arthritis/bioinformatics/machine learning/differentially expressed genes/key genes


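Funding

National Natural Science Foundation of China (62063024)

National Natural Science Foundation of China (61461038)

Central Government Guidance Fund for Local Science and Technology Development (RZ2300000684)

Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (NJZY20005)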

Publication year

2024

Journal of Inner Mongolia University (Natural Science Edition)
Inner Mongolia University

CSTPCD
Impact factor: 0.346
ISSN: 1000-1638
Number of references: 25