Research on Single Cell Feature Extraction Based on Fusion Data Diffusion Algorithm and Deep Generation Model
Deep learning models applied to single-cell transcriptome sequencing (scRNA-seq) enable the extraction of gene expression features at single-cell resolution. However, "dropout" events during scRNA-seq data collection introduce a large number of technical zeros, resulting in a noisy gene expression matrix. This noise can obscure or distort the correlations between certain genes. Blindly mining noisy data often harms the training and inference of deep learning models, leading to problems such as batch effects, false differential gene expression results, and decreased performance, thereby concealing genuine expression relationships. To tackle these challenges, this paper introduces a deep generative model that integrates a single-cell transcriptome data diffusion algorithm. By using a data diffusion method to exchange information among similar cells, the approach aims to eliminate noise in the cell count matrix and impute "dropout" events. Consequently, it enhances the clustering accuracy of deep models and effectively mitigates batch effects.
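The idea of exchanging information among similar cells can be illustrated with a minimal sketch of Markov-style data diffusion on a small count matrix. This is not the paper's algorithm, which the abstract does not specify. The function name `diffuse_counts`, the parameters `k` and `t`, the Gaussian kernel with an adaptive bandwidth, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def diffuse_counts(X, k=2, t=2):
    """Hypothetical helper: smooth a cells x genes count matrix by diffusion.

    k: neighbor rank used to set each cell's kernel bandwidth (assumed).
    t: number of diffusion steps, i.e. the power of the Markov matrix (assumed).
    Returns the diffused matrix in log-normalized space.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Library-size normalization followed by log1p, a common preprocessing choice.
    lib = X.sum(axis=1, keepdims=True)
    lib[lib == 0] = 1.0
    Z = np.log1p(X / lib * np.median(lib))

    # Pairwise squared Euclidean distances between cells.
    sq = (Z ** 2).sum(axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T), 0.0)

    # Adaptive bandwidth: squared distance to the k-th nearest neighbor
    # (index 0 of each sorted row is the cell itself).
    kth = np.sort(D2, axis=1)[:, min(k, n - 1)]
    sigma2 = np.maximum(kth, 1e-12)

    # Gaussian affinities, symmetrized so that similarity is mutual.
    A = np.exp(-D2 / sigma2[:, None])
    A = 0.5 * (A + A.T)

    # Row-normalize into a Markov transition matrix and take t steps.
    P = A / A.sum(axis=1, keepdims=True)
    Pt = np.linalg.matrix_power(P, t)

    # Each cell becomes a weighted average of cells similar to it.
    return Pt @ Z

# Toy data: two groups of three cells, with zeros standing in for dropouts.
X = np.array([
    [10, 0, 0, 5],
    [ 9, 1, 0, 0],   # gene 3 is zero here but expressed in similar cells
    [11, 0, 0, 6],
    [ 0, 8, 7, 0],
    [ 0, 9, 0, 1],   # gene 2 is zero here but expressed in similar cells
    [ 1, 10, 8, 0],
], dtype=float)

imputed = diffuse_counts(X, k=2, t=2)
```

In this sketch the zero entries of cells 1 and 4 receive positive values borrowed from their neighbors. Larger `t` smooths more aggressively and can blur real differences between cell groups, so the step count is a tuning choice rather than a fixed constant.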