Simulation of Differential Clustering Method for Privacy Data of Different Dimensions Based on K-means
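Wang Danhui (王丹辉) 1, Feng Qingwen (冯青文) 1, Hou Huifang (侯惠芳) 2
Author information
- 1. School of Information Engineering, Zhengzhou University of Science and Technology, Zhengzhou, Henan 450064
- 2. School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, Henan 450001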
Abstract
When private data of different dimensions are clustered, privacy leaks can occur if the data are not protected during the clustering process itself. To improve privacy and security during clustering, a k-means-based differential clustering method for private data of different dimensions is proposed. First, cluttered noise is removed from the data, and a CS reconstruction algorithm is used to reconstruct the original signal of the damaged data, thereby repairing the damage. Second, because k-means is prone to leaking data when cluster centers are selected, the k-means clustering algorithm is optimized with a differential algorithm: using the Laplace mechanism, regular, structured artificial noise is added to the initial cluster centers, which strengthens data protection while still allowing effective clustering. Experimental results show that the proposed method achieves good clustering quality and high performance when clustering private data.
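The step of adding Laplace noise to the initial cluster centers can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `epsilon` privacy parameter, the assumption that features are scaled to [0, 1], and the noise scale `d / epsilon` are all illustrative choices, and the sketch makes no claim to a formal privacy guarantee. The denoising and CS reconstruction steps are omitted.

```python
import numpy as np

def noisy_initial_centers(X, k, epsilon, rng):
    """Pick k data points as initial centers and perturb them with Laplace noise."""
    n, d = X.shape
    idx = rng.choice(n, size=k, replace=False)
    # Assumption: features lie in [0, 1], so one point's L1 sensitivity is
    # taken to be at most d. The paper's actual calibration is not stated.
    scale = d / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=(k, d))
    return X[idx] + noise

def kmeans(X, centers, n_iter=20):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(centers.shape[0]):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)

# Synthetic data standing in for a scaled private dataset.
rng = np.random.default_rng(0)
X = rng.random((200, 3))
init = noisy_initial_centers(X, k=4, epsilon=1.0, rng=rng)
centers, labels = kmeans(X, init)
```

A smaller `epsilon` gives larger noise, so the initial centers reveal less about the individual records they were drawn from, at the cost of a worse starting point for clustering.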
Key words
Multi-dimensional/Massive private data/Difference algorithm/Clustering algorithm/Data denoising
Funding
Scientific Research Project of Zhengzhou University of Science and Technology (2022XJKY01)
Publication year
2024