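Validation of Secure Multi-Party Computation in Credible Conjoint Analysis Platform of TCM Based on Real-World Data: Taking Descriptive Statistics and Frequency Analysis as Examples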

Objective: To validate the accuracy, stability, and efficiency of secure multi-party computation in the TCM credible conjoint analysis platform using real-world TCM data, covering distributed descriptive statistics and distributed frequency analysis.

Methods: Real-world data on patients with chronic gastritis from the spleen and stomach departments of several Grade-A tertiary hospitals were used for validation. After standardization, the data were split into multiple data sets and uploaded to separate nodes of the platform. Four continuous variables (age, PG Ⅰ, PG Ⅱ, PG Ⅰ/PG Ⅱ) and three binary categorical variables (syndrome element, tongue manifestation, pulse manifestation) were selected for descriptive statistics and frequency analysis, respectively. Single-center computation and distributed computation were run on the platform, and centralized computation was run in SPSS 26.0. For distributed descriptive statistics, the data volume per node and the number of nodes were varied across three rounds of validation. The three sets of results (single-center, distributed, and centralized) were then compared to evaluate the accuracy and stability of secure multi-party computation. Computation times were compared with t-tests in Excel to evaluate efficiency.

Results: A total of 5160 records were included. The distributed descriptive statistics for the four continuous variables were identical to the centralized results, and repeated validation gave the same results; the results of some single centers differed considerably from the pooled results. The distributed frequency analysis results for the three categorical variables were likewise identical to the centralized results. Distributed descriptive statistics took slightly longer to compute than single-center computation, but the difference was not significant for any variable (age: P=0.05; PG Ⅰ: P=0.08; PG Ⅱ: P=0.31; PG Ⅰ/PG Ⅱ: P=0.19). The computation time of distributed frequency analysis was similar to that of single-center computation, with no significant difference overall (P=0.32). Frequency analysis was, however, markedly faster than descriptive statistics (descriptive statistics: 51-84 s; frequency analysis: 28-40 s).

Conclusion: The distributed descriptive statistics and distributed frequency analysis in the secure multi-party computation of the credible conjoint analysis platform reach sufficient accuracy, stability, and efficiency while strictly protecting data security, and can be used in real multicenter clinical research in traditional Chinese medicine.
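The abstract does not say which protocol the platform uses, so the following is only a minimal sketch of why distributed descriptive statistics and frequency counts can match a centralized computation exactly. It assumes additive secret sharing over a prime field, with each node contributing only its local count, sum, and sum of squares. All names (`share`, `secure_sum`, `distributed_describe`, `distributed_frequency`), the modulus, and the two-decimal fixed-point scale are hypothetical choices for illustration, not the platform's API.

```python
import math
import secrets

PRIME = 2**61 - 1  # assumed modulus; every true total must stay below it
SCALE = 100        # assumed fixed-point scale (two decimal places)


def share(value, n_parties):
    """Split a non-negative integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values):
    """Add one private integer per node; only masked shares are exchanged."""
    n = len(private_values)
    # Row i holds the shares that node i hands out, one per node.
    distributed = [share(v, n) for v in private_values]
    # Node j adds the shares it received and publishes only that partial sum.
    partials = [sum(row[j] for row in distributed) % PRIME for j in range(n)]
    return sum(partials) % PRIME


def distributed_describe(node_data):
    """Pooled n, mean, and sample SD from per-node count, sum, sum of squares."""
    fixed = [[round(x * SCALE) for x in node] for node in node_data]
    n = secure_sum([len(node) for node in fixed])
    s1 = secure_sum([sum(node) for node in fixed])
    s2 = secure_sum([sum(x * x for x in node) for node in fixed])
    mean = s1 / n / SCALE
    # Sample variance from the aggregates: (n*s2 - s1^2) / (n*(n-1)).
    var = (n * s2 - s1 * s1) / (n * (n - 1)) / SCALE**2
    return n, mean, math.sqrt(var)


def distributed_frequency(node_labels, categories):
    """Pooled frequency table: one secure sum of local counts per category."""
    return {
        c: secure_sum([labels.count(c) for labels in node_labels])
        for c in categories
    }


if __name__ == "__main__":
    # Made-up values for two hypothetical nodes, not data from the study.
    nodes = [[50.0, 62.5], [41.0, 58.0, 70.5]]
    print(distributed_describe(nodes))
    print(distributed_frequency([["yes", "no"], ["yes", "yes", "no"]], ["yes", "no"]))
```

Because the pooled mean and standard deviation depend only on totals that add across nodes, the distributed result equals the centralized one up to the fixed-point rounding chosen here. Order statistics such as the median do not decompose this way and would need a different protocol.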

Keywords: secure multi-party computation; distributed computation; data sharing; data security; traditional Chinese medicine; credible conjoint analysis; descriptive statistics; frequency analysis

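Authors: Zhao Ran (赵冉), Zhang Wen (张雯), Tang Xudong (唐旭东), Zhang Zedan (张泽丹), He Chang (何畅), Wen Xiaoxiao (温宵宵), Wang Bin (王斌)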

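Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences (中国中医科学院中医药信息研究所), Beijing 100700

Traditional Chinese Medicine Science and Technology Cooperation Center, China Academy of Chinese Medical Sciences (中国中医科学院中医药科技合作中心), Beijing 100700

China Academy of Chinese Medical Sciences (中国中医科学院), Beijing 100700

Traditional Chinese Medicine Data Center, China Academy of Chinese Medical Sciences (中国中医科学院中医药数据中心), Beijing 100700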

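Chinese Archives of Traditional Chinese Medicine (中华中医药学刊), 2024, 42(12)
Sponsors: China Association of Chinese Medicine; Liaoning University of Traditional Chinese Medicine
Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.007
ISSN: 1673-7717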