Research on stage prediction of lung adenocarcinoma based on deep learning and multi-omics data

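Chinese abstract (translated): To address the difficulty of making accurate decisions in cancer staging, three types of omics data from 452 lung adenocarcinoma patients, namely messenger ribonucleic acid (mRNA) transcript data, micro ribonucleic acid (miRNA) transcript data, and DNA methylation data, were integrated, and a random forest algorithm was used to predict stage. First, the three types of omics data obtained from The Cancer Genome Atlas (TCGA) database were preprocessed, and the mRNA transcript data and DNA methylation data were matched by gene locus. Next, four different multi-omics integration strategies were used to integrate the preprocessed data. Finally, a random forest algorithm was applied to the integrated data to predict stage, with accuracy, the Kappa coefficient, and the area under the curve (AUC) as evaluation metrics. The results show that multi-omics integration strategies achieve higher accuracy in stage prediction. The deep-learning-based integration strategy performed best, with an accuracy of 0.940, a Kappa coefficient of 0.931, and an AUC of 0.986, and it is promising for future use in lung adenocarcinoma stage prediction.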
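
The matching and integration steps can be illustrated with a minimal sketch. The abstract does not name the four integration strategies, so the plain feature concatenation below is an assumption chosen for illustration, not the authors' method. The data layout, the function names `shared_features` and `concatenate_omics`, and all values are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: patient ID -> {feature name -> measured value}.
Omics = Dict[str, Dict[str, float]]


def shared_features(layers: List[Omics], patients: List[str]) -> List[str]:
    """Return, sorted, the features present for every patient in every layer."""
    feature_sets = [set(layer[p]) for layer in layers for p in patients]
    return sorted(set.intersection(*feature_sets)) if feature_sets else []


def concatenate_omics(
    mrna: Omics, methylation: Omics, mirna: Omics
) -> Tuple[List[str], List[List[float]]]:
    """Build one row per patient: matched mRNA, matched methylation, then miRNA."""
    # Keep only patients profiled in all three omics layers.
    patients = sorted(set(mrna) & set(methylation) & set(mirna))
    # Gene matching: keep genes measured in both the mRNA and methylation data.
    genes = shared_features([mrna, methylation], patients)
    mirnas = shared_features([mirna], patients)
    matrix = [
        [mrna[p][g] for g in genes]
        + [methylation[p][g] for g in genes]
        + [mirna[p][m] for m in mirnas]
        for p in patients
    ]
    return patients, matrix


# Toy values for illustration only (not data from the study).
mrna = {
    "P1": {"EGFR": 2.0, "KRAS": 1.0, "TP53": 0.5},
    "P2": {"EGFR": 1.5, "KRAS": 0.2, "TP53": 0.9},
}
methylation = {
    "P1": {"EGFR": 0.3, "TP53": 0.8},
    "P2": {"EGFR": 0.4, "TP53": 0.7},
    "P3": {"EGFR": 0.6, "TP53": 0.1},  # dropped: no mRNA or miRNA profile
}
mirna = {"P1": {"hsa-mir-21": 9.1}, "P2": {"hsa-mir-21": 8.4}}

patients, matrix = concatenate_omics(mrna, methylation, mirna)
print(patients)
print(matrix)
```

In this sketch, KRAS is dropped because it has no methylation measurement, and the resulting rows could then be passed to any classifier, such as a random forest.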

English title: Stage prediction of lung adenocarcinoma based on deep learning and multi-omics data

English abstract: To improve the accuracy of decision-making in cancer staging, this study integrated three kinds of omics data from 452 lung adenocarcinoma patients, namely messenger ribonucleic acid (mRNA) transcript data, micro ribonucleic acid (miRNA) transcript data, and DNA methylation data, and used a random forest algorithm to predict stage. First, the three kinds of omics data obtained from The Cancer Genome Atlas (TCGA) database were preprocessed, and the mRNA transcript data were matched with the DNA methylation data at gene loci. Then, four different multi-omics integration strategies were adopted to integrate the preprocessed data. Finally, a random forest algorithm was applied to the integrated data to predict stage, and accuracy, the Kappa coefficient, and the area under the curve (AUC) were used to evaluate prediction performance. The results show that adopting multi-omics integration strategies can achieve high accuracy. The integration strategy based on deep learning was the most effective, with accuracy, Kappa coefficient, and AUC values of 0.940, 0.931, and 0.986, respectively, and it can offer relevant guidance for lung adenocarcinoma stage prediction in the future.
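
Two of the evaluation metrics named in the abstract can be computed directly from true and predicted stage labels. The sketch below uses the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the accuracy) and p_e is the agreement expected by chance. It assumes the paper's Kappa coefficient is Cohen's kappa, which the abstract does not state. The labels are toy values, not results from the study, and AUC is omitted because it requires class probabilities.

```python
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between labels, corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of the product of marginal frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one class occurs
    return (p_o - p_e) / (1.0 - p_e)


# Toy stage labels (1 to 4) for illustration only.
y_true = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
y_pred = [1, 1, 2, 2, 2, 2, 3, 3, 4, 3]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

Because kappa discounts agreement that would occur by chance, it is lower than accuracy whenever the predictions are imperfect, which is consistent with the reported values of 0.931 and 0.940.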

English keywords:

staging of lung adenocarcinoma; deep learning; integration strategy; random forest algorithm

Authors:

刘德真 (Liu Dezhen), 李圆媛 (Li Yuanyuan)

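Affiliation:

School of Optical Information and Energy Engineering and School of Mathematics and Physics, Wuhan Institute of Technology, Wuhan 430205, Hubei, China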

Keywords:

lung adenocarcinoma staging; deep learning; integration strategy; random forest algorithm

Funding:

National Natural Science Foundation of China

Grant number:

12001408

Publication year:

2024

DOI:

10.19843/j.cnki.CN42-1779/TQ.202307022

Journal of Wuhan Institute of Technology (武汉工程大学学报)

Wuhan Institute of Technology

Impact factor: 0.463

ISSN: 1674-2869

Year, volume (issue): 2024, 46(2)

Number of references: 26