Optimization method of approximate aggregate query based on variational auto-encoder

Skewed data are unevenly distributed, so traditional approximate aggregate query methods struggle to draw samples that represent them. To address this problem, an approximate aggregate query method based on an optimized variational auto-encoder (VAE) is proposed, and its effect on query accuracy for skewed data is studied. In the preprocessing stage, the skewed data are stratified and grouped. The network structure and loss function of the VAE generative model are then optimized to reduce the relative error of approximate aggregate queries. Experimental results show that, compared with the baseline methods, the proposed method gives a smaller relative query error on skewed data, and that the error rises more gently as the skewness coefficient increases.
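The abstract evaluates methods by the relative error of an approximate aggregate query and relies on stratified grouping to cope with skew. The sketch below illustrates only those two ideas. It does not implement the paper's VAE model, and the function names, the fixed stratum boundaries, and the log-normal test data are all assumptions made for illustration. Relative error is taken in its usual form, |estimate - truth| / |truth|.

```python
import random
import statistics


def relative_error(estimate, truth):
    """Relative error of an approximate answer against the exact answer."""
    return abs(estimate - truth) / abs(truth)


def uniform_sum_estimate(values, n, rng):
    """Estimate SUM(values) by scaling up the mean of a uniform sample of size n."""
    sample = rng.sample(values, n)
    return len(values) * statistics.fmean(sample)


def stratified_sum_estimate(values, boundaries, n_per_stratum, rng):
    """Estimate SUM(values) by sampling each value-range stratum separately.

    `boundaries` is an ascending list of cut points (hypothetical; the paper's
    grouping rule is not given in the abstract). Stratum i holds the values
    that are >= exactly i of the cut points.
    """
    strata = [[] for _ in range(len(boundaries) + 1)]
    for v in values:
        idx = sum(v >= b for b in boundaries)
        strata[idx].append(v)

    total = 0.0
    for stratum in strata:
        if not stratum:
            continue
        k = min(n_per_stratum, len(stratum))
        sample = rng.sample(stratum, k)
        # Scale each stratum's sample mean by that stratum's true size.
        total += len(stratum) * statistics.fmean(sample)
    return total


if __name__ == "__main__":
    rng = random.Random(42)
    # Right-skewed synthetic data (assumed; not the paper's data sets).
    data = [rng.lognormvariate(0.0, 1.5) for _ in range(100_000)]
    truth = sum(data)

    uniform = uniform_sum_estimate(data, 300, rng)
    stratified = stratified_sum_estimate(data, [1.0, 10.0, 50.0], 75, rng)

    print("uniform    relative error:", relative_error(uniform, truth))
    print("stratified relative error:", relative_error(stratified, truth))
```

Sampling the long tail as its own stratum keeps rare, large values from being missed or over-weighted, which is the intuition behind grouping skewed data before sampling or generating it. The results of a single run depend on the seed and the chosen boundaries, so no particular numbers are claimed here.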

approximate query processing; skewed distribution; machine learning; variational auto-encoder; group sampling

Huang Longsen (黄龙森), Fang Jun (房俊), Zhou Yunliang (周云亮), Guo Zhicheng (郭志城)

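School of Information Science and Technology, North China University of Technology, Beijing 100144

Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, North China University of Technology, Beijing 100144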

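National Natural Science Foundation of China, International (Regional) Cooperation and Exchange Program (62061136006); National Natural Science Foundation of China, Key Program (61832004)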

2024

Journal of Zhejiang University (Engineering Science)
Zhejiang University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.625
ISSN: 1008-973X
Year, volume (issue): 2024, 58(5)