SDAEC Method and its Application in Batch Effect Removal for Single Cell mRNA Sequence
(Chinese title: SDAEC算法在单细胞测序数据批次校正中的应用)

Objective To propose a deep stacked denoising autoencoder embedded cluster (SDAEC) algorithm, apply it to remove batch effects from single-cell mRNA sequencing (scRNA-seq) data, and evaluate its batch-effect removal performance. Methods Because single-cell data are high-dimensional, highly sparse, and subject to highly nonlinear error, the single-cell Louvain clustering algorithm was embedded into the stacked denoising autoencoder (SDAE) algorithm to form the SDAEC algorithm for removing batch effects from scRNA-seq data. The algorithm was applied to scRNA-seq data from real ovarian cancer tissue, and its batch-effect removal performance was evaluated with t-distributed stochastic neighbor embedding (tSNE), the k-nearest-neighbor batch-effect test (kBET), the adjusted Rand index (ARI), normalized mutual information (NMI), and average silhouette width (ASW). Results SDAEC outperformed Combat, mutual nearest neighbors (MNN), maximum mean discrepancy distribution-matching residual networks (MMD-ResNet), and zero-inflated negative binomial-based wanted variation extraction (ZINB-WaVE) in removing batch effects from scRNA-seq data. Conclusion The SDAEC algorithm can remove batch effects from scRNA-seq data and improve the validity of downstream scRNA-seq analysis, and it has practical application value.
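
Two of the evaluation indices named in the abstract, ARI and NMI, compare two labelings of the same cells. The sketch below is a minimal, standard-library Python illustration of their textbook definitions; it is not the authors' code, and the function names and the toy labels are hypothetical. The abstract does not say which labelings the paper compared, so the usage at the end assumes one common convention: clusters found after correction should agree with cell type (values near 1) and not with batch (values near 0 or below).

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare; treated as perfect agreement
    pair_counts = Counter(zip(labels_a, labels_b))  # contingency table
    row_counts = Counter(labels_a)
    col_counts = Counter(labels_b)
    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())
    expected = sum_rows * sum_cols / comb(n, 2)  # chance-level agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (index - expected) / (max_index - expected)


def normalized_mutual_info(labels_a, labels_b):
    """Mutual information divided by the arithmetic mean of the entropies."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    pair_counts = Counter(zip(labels_a, labels_b))
    row_counts = Counter(labels_a)
    col_counts = Counter(labels_b)

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    mutual_info = sum(
        (c / n) * log(c * n / (row_counts[a] * col_counts[b]))
        for (a, b), c in pair_counts.items()
    )
    denom = (entropy(row_counts) + entropy(col_counts)) / 2
    if denom == 0:  # both labelings consist of a single cluster
        return 1.0
    return mutual_info / denom


# Hypothetical toy data: four cells, clusters found after batch correction.
clusters = [0, 0, 1, 1]
cell_type = ["T", "T", "B", "B"]    # assumed known cell-type annotation
batch = ["b1", "b2", "b1", "b2"]    # assumed batch of origin

ari_cell_type = adjusted_rand_index(clusters, cell_type)
ari_batch = adjusted_rand_index(clusters, batch)
nmi_cell_type = normalized_mutual_info(clusters, cell_type)
nmi_batch = normalized_mutual_info(clusters, batch)
```

In this toy case the clusters match cell type exactly and are independent of batch, which is the pattern one would hope to see after successful batch-effect removal. For real data, a maintained implementation such as the one in scikit-learn is likely a better choice than this sketch.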

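Keywords: stacked denoising auto encoder embedded cluster; single cell mRNA sequence; batch effects; ovarian cancer

Authors: Wang Wenjie (王文杰), Li Kang (李康), Xie Hongyu (谢宏宇)

Affiliations:
Department of Health Statistics, Harbin Medical University (150081)
Clinical Research Center, Women's Hospital, Zhejiang University School of Medicine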

Funding: National Natural Science Foundation of China (82003551); Zhejiang Provincial Natural Science Foundation (LT-GY24H160008)

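Journal: Chinese Journal of Health Statistics (中国卫生统计)
Sponsoring organizations: Chinese Health Information Association; China Medical University
Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.172
ISSN: 1002-3674
Year, volume (issue): 2024, 41(4)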