
Key gene screening for colorectal cancer expression and prognosis based on bioinformatics analysis

Objective: To identify differentially expressed genes (DEGs) in colorectal cancer and to explore the key pathways and genes involved, given the relatively high incidence of the disease.

Methods: The expression datasets GSE211496, GSE6988 and GSE29900 were selected from the Gene Expression Omnibus (GEO) and analyzed with the GEO2R tool, and the relevant data were downloaded. Target genes of the differentially expressed miRNAs in GSE29900 were predicted with the online databases miRDB and TargetScan. A Venn diagram was then used to take the intersection of the DEGs from the three datasets, and the DAVID database tool was used for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. A protein-protein interaction (PPI) network was constructed and visualized with Cytoscape. Hub gene expression was validated against The Cancer Genome Atlas (TCGA) database, and receiver operating characteristic (ROC) curve analysis was performed with the pROC package on the hub genes whose expression was consistent with TCGA. Finally, the Kaplan-Meier plotter online tool was used for prognostic analysis of colorectal cancer patients.

Results: Screening identified 2,570 DEGs in GSE211496, 406 DEGs in GSE6988, and 99 differentially expressed miRNAs in GSE29900 (each at p.adj < 0.01 and |log2FC| ≥ 1). A total of 14,938 target genes of the differentially expressed miRNAs were predicted, and overlapping these target genes with the DEGs yielded 30 candidate genes. KEGG analysis showed that the candidate genes were mainly enriched in the vascular smooth muscle contraction and mineral absorption pathways. The top 10 hub genes were selected from the PPI network by connectivity. TCGA validation showed that the expression of MYL9, ACTG2, AGT and PDGFRA was consistent with the GSE211496 dataset. In the diagnostic analysis of these four hub genes, AGT (AUC = 0.901, 95% CI 0.868-0.933) was positively correlated with the predicted occurrence of colorectal cancer, whereas MYL9 (AUC = 0.820, 95% CI 0.757-0.884), ACTG2 (AUC = 0.855, 95% CI 0.802-0.908) and PDGFRA (AUC = 0.815, 95% CI 0.772-0.858) were negatively correlated with it. MYL9, ACTG2 and PDGFRA each showed a moderate degree of diagnostic accuracy for colorectal cancer, and AGT showed high diagnostic accuracy. Kaplan-Meier survival analysis showed that low expression of PDGFRA, ACTG2 and MYL9 was each associated with better prognosis, and the differences were statistically significant (P < 0.05).

Conclusion: Through bioinformatics analysis, this study screened and identified four hub genes in colorectal cancer: PDGFRA, ACTG2, MYL9 and AGT. These genes may provide new directions for colorectal cancer research.
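The screening steps described in the abstract (threshold filtering, dataset intersection, hub ranking by connectivity, and AUC estimation) can be illustrated with a minimal Python sketch. This is not the authors' workflow, which relied on GEO2R, DAVID, Cytoscape and the R package pROC. All function names, gene labels (`GENE_A` and so on) and numbers below are hypothetical toy values chosen only to show the logic, and the AUC is computed with the rank-based (Mann-Whitney) definition rather than with pROC.

```python
# Toy illustration only: gene labels and values are invented, not study data.

def filter_degs(rows, padj_cutoff=0.01, lfc_cutoff=1.0):
    """Keep genes with p.adj < cutoff and |log2FC| >= cutoff.

    rows: iterable of (gene, log2fc, padj) tuples.
    """
    return {gene for gene, lfc, padj in rows
            if padj < padj_cutoff and abs(lfc) >= lfc_cutoff}


def top_hubs(edges, k):
    """Rank PPI nodes by degree (number of edges) and return the top k.

    Ties are broken alphabetically so the result is deterministic.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sorted(degree, key=lambda g: (-degree[g], g))[:k]


def auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney definition)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical differential-expression tables: (gene, log2FC, p.adj)
dataset_1 = [("GENE_A", -2.1, 0.001), ("GENE_B", 1.8, 0.004),
             ("GENE_C", 0.2, 0.500), ("GENE_D", -1.5, 0.020)]
dataset_2 = [("GENE_A", -1.2, 0.003), ("GENE_B", 1.0, 0.009),
             ("GENE_E", 3.0, 0.0001)]
# Hypothetical predicted miRNA target genes
mirna_targets = {"GENE_A", "GENE_B", "GENE_E", "GENE_F"}

# Intersection of the three gene sets (the Venn-diagram step)
candidates = filter_degs(dataset_1) & filter_degs(dataset_2) & mirna_targets

# Hypothetical PPI edges, ranked by degree
ppi_edges = [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"),
             ("GENE_B", "GENE_C"), ("GENE_A", "GENE_D")]
hubs = top_hubs(ppi_edges, k=2)

# Hypothetical expression scores for tumour and normal samples
tumour_scores = [0.9, 0.8, 0.4]
normal_scores = [0.5, 0.3, 0.2]
example_auc = auc(tumour_scores, normal_scores)

print(sorted(candidates), hubs, round(example_auc, 3))
```

For a gene whose expression is lower in tumour than in normal tissue, the same function can be applied after negating the scores, so that larger values still point toward the tumour class.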

Keywords: Colorectal cancer; Bioinformatics; Differentially expressed genes; Receiver operating characteristic curve analysis; Survival analysis

Authors: Chen Xiaoling (陈晓玲), Yang Juan (杨娟), Lü Minmin (吕敏敏), Jia Yizhen (贾亦真)


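Department of Infectious Diseases, The University of Hong Kong-Shenzhen Hospital, Shenzhen, Guangdong 518000, China

Department of Clinical Laboratory, Shenzhen Pingshan District Maternal and Child Health Hospital, Shenzhen, Guangdong 518000, China

Central Laboratory, The University of Hong Kong-Shenzhen Hospital, Shenzhen, Guangdong 518000, China

Clinical Research Management Office, Research Department, The University of Hong Kong-Shenzhen Hospital, Shenzhen, Guangdong 518000, China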



2024

Journal: Modern Medicine & Health (现代医药卫生)
Chongqing Health Information Center


Impact factor: 0.758
ISSN: 1009-5519
Year, volume (issue): 2024, 40(7)