
Screening of characteristic genes in early-onset pre-eclampsia and analysis of their association with immune cell infiltration based on bioinformatics analysis and machine-learning algorithms

Objective: To screen for characteristic genes of early-onset pre-eclampsia (EOSP) and to analyze their association with immune cell infiltration, using bioinformatics analysis and machine-learning methods.

Methods: The Gene Expression Omnibus (GEO) database was searched with the term "early-onset pre-eclampsia" for mRNA expression data from placental tissue of women with EOSP and women with normal pregnancy. The microarray data were processed in R (background correction, normalization, summarization, and probe quality control), annotation packages were downloaded for ID conversion, and the expression matrices were extracted. After batch effects were removed, differentially expressed genes (DEGs) between EOSP and normal pregnancy in the merged data were identified with the limma package. Characteristic genes were identified by support vector machine-recursive feature elimination (SVM-RFE) and a LASSO regression model. The diagnostic performance of the characteristic genes was assessed by the area under the receiver operating characteristic curve (AUC). Placental tissues were then retrospectively collected from 15 women with EOSP and 15 women with normal pregnancy who delivered at Beijing Obstetrics and Gynecology Hospital, Capital Medical University, from January 1, 2022, to February 28, 2023. Expression of the characteristic genes was verified by quantitative real-time polymerase chain reaction (qRT-PCR) and Western blot, and further validated in the validation dataset. Finally, the CIBERSORT algorithm was used to estimate the relative proportions of infiltrating immune cells in EOSP. Between-group differences were compared with the t-test.

Results: Three datasets were obtained: GSE44711 (8 EOSP and 8 normal pregnancy), GSE74341 (7 EOSP and 5 normal pregnancy), and GSE190639 (13 EOSP and 13 normal pregnancy). Merging GSE44711 and GSE74341 yielded 29 DEGs, of which 27 were upregulated and 2 were downregulated. Gene Ontology (GO) enrichment analysis showed that these 29 DEGs are mainly involved in gonadotropin secretion, female pregnancy, regulation of endocrine processes, endocrine hormone secretion, and negative regulation of hormone secretion. Combining the LASSO regression and SVM-RFE algorithms identified eight characteristic genes: EBI3, HTRA4, TREML2, TREM1, NTRK2, ANKRD37, CST6, and ARMS2. qRT-PCR and Western blot confirmed statistically significant expression differences for these genes (all P<0.05, except for CST6). In logistic regression analysis, the AUCs (95%CI) were: TREML2, 0.979 (0.918-1.000); ANKRD37, 0.969 (0.897-1.000); NTRK2, 0.969 (0.892-1.000); TREM1, 0.979 (0.918-1.000); HTRA4, 0.990 (0.954-1.000); EBI3, 0.990 (0.954-1.000); and ARMS2, 0.903 (0.764-1.000). Immune cell infiltration analysis showed that the proportion of M2 macrophages in EOSP placental tissue was significantly lower than in normal pregnancy (0.167±0.074 vs. 0.462±0.091, P=0.002), whereas the proportions of monocytes (0.201±0.004 vs. 0.085±0.006) and eosinophils (0.031±0.001 vs. 0.001±0.000) were significantly higher (both P<0.05). Correlation analysis showed that the seven characteristic genes were significantly correlated with infiltrating immune cells (all P<0.05).

Conclusion: Using bioinformatics analysis and machine-learning algorithms, this study identified seven characteristic genes that are important for the prediction and early diagnosis of EOSP, providing new research targets and a basis for the future prevention and treatment of pre-eclampsia.
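The per-gene AUC values reported in the Results can be read as the probability that a randomly chosen EOSP sample scores higher on that gene than a randomly chosen normal-pregnancy sample. The following is a minimal sketch of that rank-based (Mann-Whitney) definition, not the authors' R pipeline: it works on raw expression values rather than logistic-regression predictions, and the function name `auc_from_scores` and the expression values are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def auc_from_scores(case_scores: Sequence[float],
                    control_scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs in which the case scores higher.

    Ties count as one half, which matches the Mann-Whitney U formulation.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical normalized expression values for one upregulated gene
# (illustrative numbers only, not data from the study).
eosp = [8.1, 7.9, 8.4, 7.2, 8.8]
normal = [6.9, 7.3, 7.0, 6.5, 7.1]
print(auc_from_scores(eosp, normal))  # 24 of 25 pairs favor EOSP -> 0.96
```

For a single gene, a logistic regression with a positive coefficient is a monotone transform of the expression value, so it would give the same AUC as the raw values used here. For a downregulated gene the two groups would be swapped, or the scores negated, before applying the same function.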

Keywords: Pre-eclampsia; Computational biology; Machine learning; Gene expression; Cellular microenvironment; Macrophages

Wu Zitong (武紫彤), Zheng Yuanyuan (郑媛媛), Ding Xin (丁新)


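Department of Obstetrics, Beijing Obstetrics and Gynecology Hospital, Capital Medical University (Beijing Maternal and Child Health Care Hospital), Beijing 100026, China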


2024

Chinese Journal of Perinatal Medicine
Chinese Medical Association


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.438
ISSN: 1007-9408
Year, volume (issue): 2024, 27(1)
  • 24