Bayesian-based high-dimensional data governance method for distributed databases

Insufficient high-dimensional data governance capability in a distributed database increases network transmission cost and additional network overhead. To identify abnormal data quickly, retrieve target data in a timely manner, and predict target problems accurately, a Bayesian-based high-dimensional data governance method for distributed databases is proposed. The method first uses a segmentation factor to delete invalid data blocks in the distributed database, which reduces the volume of high-dimensional data migrated during task execution. It then applies a Bayesian method, from the two perspectives of the prior distribution and the posterior distribution, to select variables that are strongly correlated with both the outcome variable and the starting variable. Finally, under the given conditions of the specific scenario corresponding to the distributed database, it determines where the points of each variable fall, completing the design of a high-dimensional data control chart for the distributed environment and thereby achieving high-dimensional data governance for the distributed database. Experimental results show that the proposed method governs the data effectively.
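The abstract names its stages but gives no formulas, so the following is only an illustrative sketch of two of them (block pruning and a Bayesian control chart), not the authors' implementation. Everything in it is an assumption made for illustration: the "segmentation factor" is stood in for by a simple share of non-missing values per block, the Bayesian step is a textbook normal-normal conjugate update with known noise variance, and all function names, thresholds, and sample values are hypothetical. The variable-selection stage is omitted because the abstract does not describe it in enough detail to sketch.

```python
from math import sqrt


def prune_blocks(blocks, min_valid_ratio=0.5):
    """Drop blocks whose share of non-missing values is below the threshold.

    Assumption: this ratio is a stand-in for the paper's segmentation factor.
    """
    kept = []
    for block in blocks:
        valid = sum(v is not None for v in block)
        if block and valid / len(block) >= min_valid_ratio:
            kept.append([v for v in block if v is not None])
    return kept


def posterior_mean_var(samples, prior_mean, prior_var, noise_var):
    """Normal-normal conjugate update for an unknown mean, known noise variance."""
    n = len(samples)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(samples) / noise_var)
    return post_mean, post_var


def control_limits(post_mean, post_var, noise_var, k=3.0):
    """Control limits from the posterior predictive spread of one new point."""
    spread = k * sqrt(post_var + noise_var)
    return post_mean - spread, post_mean + spread


def out_of_control(points, limits):
    """Return the indices of points that fall outside the control limits."""
    low, high = limits
    return [i for i, x in enumerate(points) if x < low or x > high]


# Hypothetical data: three blocks of one monitored variable, None = missing.
blocks = [
    [10.1, 9.9, None, 10.0],
    [None, None, None, 9.0],   # mostly missing, so it is pruned
    [10.2, 9.8, 10.0, 10.1],
]
kept = prune_blocks(blocks)
samples = [v for block in kept for v in block]

post_mean, post_var = posterior_mean_var(
    samples, prior_mean=10.0, prior_var=1.0, noise_var=0.04
)
limits = control_limits(post_mean, post_var, noise_var=0.04)
flagged = out_of_control([10.0, 10.9, 9.2, 10.3], limits)
print(kept, limits, flagged)
```

Pruning before the update means the mostly missing block never has to be moved or read by the later stages, which is the kind of saving in data migration that the abstract attributes to its first step.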

Keywords: distributed database; high-dimensional data; pruning; Bayes; control chart

Authors: 张立冬 (Zhang Lidong), 潘伟 (Pan Wei), 刘敏 (Liu Min)

Affiliation: China Tobacco Hubei Industrial Co., Ltd. (湖北中烟工业有限责任公司), Wuhan 430040, China

Funding: Research Project on Information Architecture for Enterprise Digital Transformation (2020420000340014)

2024

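Journal: 自动化与仪器仪表 (Automation and Instrumentation)
重庆工业自动化仪表研究所 (Chongqing Industrial Automation Instrumentation Research Institute); 重庆市自动化与仪器仪表学会 (Chongqing Society of Automation and Instrumentation)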

CSTPCD
Impact factor: 0.327
ISSN:1001-9227
Year, volume (issue): 2024, (7)