Stage prediction of lung adenocarcinoma based on deep learning and multi-omics data
To improve the accuracy of decision-making in cancer staging, this study integrated three kinds of omics data, namely messenger ribonucleic acid (mRNA) transcript data, micro ribonucleic acid (miRNA) transcript data and DNA methylation data, from 452 lung adenocarcinoma patients, and used a random forest algorithm to predict stages. First, the three kinds of omics data, obtained from The Cancer Genome Atlas (TCGA) database, were preprocessed, and the mRNA sequencing data were matched with the DNA methylation data at gene loci. Next, four different multi-omics integration strategies were used to integrate the preprocessed data. Finally, a random forest algorithm was applied to the integrated data to predict stage, and accuracy, the Kappa coefficient and the area under the curve (AUC) were used to evaluate prediction performance. The results show that the multi-omics integration strategies can achieve high accuracy. The deep-learning-based integration strategy was the most effective, with accuracy, Kappa coefficient and AUC values of 0.940, 0.931 and 0.986, respectively, and it can offer guidance for future stage prediction of lung adenocarcinoma.
staging of lung adenocarcinoma; deep learning; integration strategy; random forest algorithm
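
The abstract reports accuracy and the Kappa coefficient without stating how they were computed. As a minimal sketch, assuming the unweighted Cohen's kappa (the abstract does not name the variant), the two metrics can be derived from true and predicted stage labels as follows. The function names and the toy labels are hypothetical and are not taken from the study; AUC is left out because its multi-class definition depends on choices the abstract does not specify.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted stage equals the true stage."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy) and p_e is the agreement
    expected by chance from the marginal label frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Every label falls in a single class on both sides: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Toy stage labels for illustration only (not data from the study).
y_true = ["I", "I", "I", "II", "II", "III", "III", "IV"]
y_pred = ["I", "I", "II", "II", "II", "III", "IV", "IV"]

print(f"accuracy={accuracy(y_true, y_pred):.3f} "
      f"kappa={cohen_kappa(y_true, y_pred):.3f}")
# prints: accuracy=0.750 kappa=0.667
```

Kappa is lower than accuracy here because it discounts the agreement that the marginal stage frequencies would produce by chance, which is why it is a useful companion metric when stage classes are imbalanced.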