
Research on Multi-source Heterogeneous Data Governance Technology Based on Cloud Data Center

Conventional multi-source heterogeneous data governance methods mainly judge data attributes in order to clean data region by region. Because they do not analyze nonlinear data, their governance performance is poor. To address this, a multi-source heterogeneous data governance technique based on a cloud data center is proposed. The ETL function of a relational database is used to clean the data, and the data transformation modes and data cleaning rules are defined. The mutual information coefficient is introduced to determine the degree of correlation between data and to analyze nonlinear data correlation. A multi-source heterogeneous data governance system is then built with the cloud data center as its carrier. The governance performance of the proposed technique is tested experimentally. The results show that the technique achieves high precision and governs multi-source heterogeneous data in the cloud data center effectively.
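The abstract's motivation for mutual information, that it can detect dependence a linear measure misses, can be illustrated with a minimal sketch. The abstract does not give the paper's definition of the mutual information coefficient, so the code below is an assumption: it uses a plain equal-width histogram estimate of mutual information in nats, and the function names `mutual_information` and `pearson` and the `bins` parameter are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=8):
    """Estimate I(X;Y) in nats from paired samples using equal-width bins.

    Assumption: a simple histogram estimator, not the paper's exact coefficient.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        if hi == lo:
            # A constant column falls into a single bin.
            return [0] * len(values)
        width = (hi - lo) / bins
        # Clip so that the maximum value lands in the last bin.
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)

    mi = 0.0
    for (i, j), c in joint.items():
        # p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts.
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# A purely nonlinear relationship: y = x^2 on a range symmetric around zero.
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]

linear_score = pearson(xs, ys)                # close to zero
nonlinear_score = mutual_information(xs, ys)  # clearly positive
```

Here the Pearson coefficient is close to zero even though `ys` is fully determined by `xs`, while the binned mutual information is clearly positive. The estimate depends on the choice of `bins`, so in practice the bin count would need tuning for the data at hand.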

cloud data center; multi-source heterogeneous data; data governance; data cleansing

Sun Yu (孙瑜)

Detachment 45, Unit 92941 of the Chinese People's Liberation Army, Huludao, Liaoning 125001

2024

Computer Measurement & Control
China Computer Automatic Measurement and Control Technology Association

CSTPCD
Impact factor: 0.546
ISSN: 1671-4598
Year, volume (issue): 2024, 32(3)