Research on Automatic Desensitization Algorithm of Power Consumption Data Based on KL-divergence
Power data contain private information, and a leak poses a hidden danger to personal privacy. To ensure the security of power data, this paper proposes an automatic desensitization algorithm for power data based on KL divergence. A sensitive data filtering model is established based on KL divergence, and the KL distance between the data of different variables is calculated to obtain their similarity index. The user-item scores are smoothed, and the sensitive data are divided into batches according to their similarity. The sensitive data are then de-identified and converted anonymously, and the probability that a user's real path is leaked is calculated. The automatic desensitization algorithm calculates the degree of loss of conceptual data, tuple information, and information flow, respectively, to determine whether the desensitized data remain usable. Data consistency is checked before and after desensitization. The change rates of the three types of electric power consumption data are 0.43%, 0.14%, and 0.11%, which are far below the standard value. In addition, the amount of data processed per unit time and the average delay of the algorithm during operation are also satisfactory, which shows that the desensitization algorithm is practical.
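As an illustration of the KL distance step, the following is a minimal sketch, not the paper's implementation. It assumes each variable is summarized as a discrete histogram over the same bins. The function names, the symmetrized form of the distance, and the mapping of distance to a similarity index via 1 / (1 + d) are all assumptions made for this example; the abstract does not specify them.

```python
import math


def normalize(counts):
    """Turn non-negative bin counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same bins.

    eps guards against division by zero when q has an empty bin
    (an assumption of this sketch, not a step taken from the paper).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p), so the distance is symmetric."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def similarity(p, q):
    """Hypothetical similarity index in (0, 1]; 1 means identical distributions."""
    return 1.0 / (1.0 + symmetric_kl(p, q))


if __name__ == "__main__":
    # Hypothetical histograms of two power-consumption variables.
    a = normalize([10, 30, 40, 20])
    b = normalize([12, 28, 38, 22])
    print(similarity(a, b))
```

Under these assumptions, variables whose similarity index exceeds a chosen threshold could be placed in the same batch before de-identification.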