Validation of Secure Multi-Party Computation in a Credible Conjoint Analysis Platform for TCM Based on Real-World Data: Taking Descriptive Statistics and Frequency Analysis as Examples
Objective To validate the accuracy, stability and efficiency of secure multi-party computation in the TCM credible conjoint analysis platform using real-world TCM data, covering distributed descriptive statistics and distributed frequency analysis. Methods Real-world data from chronic gastritis patients in several Grade A tertiary hospitals were used for validation. After standardized processing, the data were divided into multiple data sets and uploaded to the nodes of the platform. Four continuous variables (age, PG Ⅰ, PG Ⅱ, PG Ⅰ/PG Ⅱ) were selected for descriptive statistics, and three dichotomous categorical variables (syndrome element, tongue condition, pulse condition) were selected for frequency analysis. Single-center computing and distributed computing were carried out on the platform, and centralized computing was carried out in SPSS V26.0. For distributed descriptive statistics, the data volume and the number of nodes were adjusted across three rounds of validation. The single-center, distributed and centralized results were then compared to evaluate the accuracy and stability of secure multi-party computation. A t-test in Excel was used to compare computation times and evaluate efficiency. Results A total of 5160 cases were included. The distributed descriptive statistics for the four continuous variables were completely consistent with the centralized results, and the results were identical across the repeated validations. The results of some single centers differed considerably from the pooled results. The distributed frequency analysis results for the three categorical variables were also completely consistent with the centralized results. The computation time of distributed descriptive statistics was slightly longer than that of single-center computation, but the difference was not significant for any variable (age P=0.05, PG Ⅰ P=0.08, PG Ⅱ P=0.31, PG Ⅰ/PG Ⅱ P=0.19). The computation time of distributed frequency analysis was similar to that of single-center computation, with no significant difference overall (P=0.32). However, the computation time of frequency analysis was substantially shorter than that of descriptive statistics (descriptive statistics: 51-84 s; frequency analysis: 28-40 s). Conclusion The distributed descriptive statistics and distributed frequency analysis in the secure multi-party computation of the credible conjoint analysis platform achieve sufficient accuracy, stability and efficiency while strictly protecting data security, and can be used in real multi-center clinical research in traditional Chinese medicine.
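The abstract does not describe the protocol behind the platform's distributed statistics. As an illustration only, the sketch below shows one common way such results can match a centralized calculation exactly: each node reduces its data to local totals (count, sum, sum of squares, or category counts), splits each total into additive secret shares, and only the global totals are ever reconstructed. Additive secret sharing, the function names, `PRIME` and `SCALE` are all assumptions made for this example and are not taken from the platform; the sketch also assumes non-negative values (as with age and PG levels) and simulates all nodes in a single process.

```python
import secrets
from collections import Counter

PRIME = 2**127 - 1  # modulus for the shares; must exceed every global total (assumed)
SCALE = 10**4       # fixed-point scale for decimal values (assumed)


def share(value, n_parties):
    """Split a non-negative integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(local_values):
    """Add one integer per node; only the grand total is reconstructed."""
    n = len(local_values)
    # Row i holds the shares that node i sends out, one per node.
    sent = [share(v, n) for v in local_values]
    # Node j adds up the shares it received (column j).
    partials = [sum(row[j] for row in sent) % PRIME for j in range(n)]
    return sum(partials) % PRIME


def distributed_describe(node_data):
    """Return the global n, mean and sample SD from per-node lists of values."""
    n = secure_sum([len(xs) for xs in node_data])
    s = secure_sum([sum(round(x * SCALE) for x in xs) for xs in node_data])
    q = secure_sum([sum(round(x * SCALE) ** 2 for x in xs) for xs in node_data])
    mean = s / (SCALE * n)
    # Sample variance from the pooled sum and sum of squares (integer numerator).
    variance = (q * n - s * s) / (n * (n - 1) * SCALE**2)
    return n, mean, variance ** 0.5


def distributed_frequency(node_labels, categories):
    """Return global counts per category; categories are agreed in advance."""
    local = [Counter(labels) for labels in node_labels]
    return {c: secure_sum([cnt[c] for cnt in local]) for c in categories}
```

Because the global count, sum and sum of squares are exact, the mean and standard deviation derived from them should agree with a pooled calculation up to the fixed-point rounding set by `SCALE`, which is consistent with the agreement between distributed and centralized results reported above. A real deployment would also need network transport, authentication and protection against colluding nodes, none of which is shown here.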