
Research on K-medoids Clustering Integration of Multi-Source Information Data in Oil and Gas Engineering

Information data in oil and gas engineering is voluminous and comes from many sources; it is widely distributed and of uneven quality. Integrating all data points directly tends to degrade the quality of the information matrix, so the result often fails to meet practical application needs. To address this, an integration algorithm for multi-source oil and gas engineering information data based on K-medoids clustering is proposed. First, a multi-source dataset is constructed and representative points are selected from it using a decision graph. Next, a hybrid representative strategy based on the nearest-neighbor approximation principle is used to construct a sparse affinity submatrix, which is then sparsified; combined with a fast nearest-representative approximation method, this yields the base clustering results for the data. Finally, a Lagrangian function is used to assign weights to the base clustering results, the clustering cost is computed, and the integration of the multi-source information data is completed. Experiments show that the proposed method needs a low average number of iterations on the datasets, that clustering accuracy (CA) stays above 96% and normalized mutual information (NMI) stays above 0.94, and that the curves are stable with small fluctuations, indicating that the clustering integration is accurate and effective.
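As a rough illustration of the K-medoids step named in the abstract and of the NMI score used to evaluate it, the following Python sketch may help. It is not the authors' implementation: it omits the decision-graph representative selection, the sparse affinity submatrix, and the Lagrangian weighting. The function names `k_medoids` and `nmi`, the Euclidean distance, the random initialization, and the arithmetic-mean normalization of NMI are all assumptions made for this example.

```python
import numpy as np


def k_medoids(X, k, max_iter=100, seed=0):
    """Alternating K-medoids on Euclidean distances (illustrative sketch).

    Returns (medoid_indices, labels). Hypothetical helper, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Full pairwise distance matrix: fine for small n, too costly for large n.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            # Update step: pick the member with the smallest total
            # distance to the other members of its cluster.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels


def nmi(a, b):
    """Normalized mutual information, assuming 2*I(A;B) / (H(A) + H(B))."""
    _, ai = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, bi = np.unique(np.asarray(b).ravel(), return_inverse=True)
    # Contingency table of label co-occurrences.
    C = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(C, (ai, bi), 1)
    P = C / C.sum()
    pa, pb = P.sum(axis=1), P.sum(axis=0)
    nz = P > 0
    mi = (P[nz] * np.log(P[nz] / np.outer(pa, pb)[nz])).sum()
    ha = -(pa * np.log(pa)).sum()
    hb = -(pb * np.log(pb)).sum()
    if ha == 0 and hb == 0:
        return 1.0  # both labelings are a single cluster
    return 2 * mi / (ha + hb)


if __name__ == "__main__":
    # Synthetic data: two well-separated blobs stand in for real records.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
    truth = np.array([0] * 20 + [1] * 20)
    medoids, labels = k_medoids(X, 2)
    print("medoid indices:", medoids, "NMI:", nmi(truth, labels))
```

Because a medoid is always an actual data point, this variant needs only pairwise distances, which is one reason K-medoids is often preferred over K-means when records are noisy or of uneven quality.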

K-medoids clustering; multi-source information data; decision graph; sparse affinity submatrix; base clustering

Gao Lijuan (高丽娟), Wang Zhiwei (王志伟), Li Mingjiang (李明江), Qu Xiaohui (曲晓慧)


Engineering Technology Branch, CNOOC Energy Technology & Services Limited, Tianjin 300450


2024

Computer Simulation (计算机仿真)
No. 17 Research Institute, China Aerospace Science and Industry Corporation


CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Year, volume (issue): 2024, 41(11)