A Contextual Outlier Detection Algorithm Based on Weighted Probability Density

A contextual outlier detection algorithm based on weighted probability density is proposed. A Gaussian mixture model and a sparsity matrix are used to determine the relevant (correlation) subspace. Within the relevant subspace, a weighted probability density local anomaly factor formula is used to compute the outlier factor of each data object, which reflects the degree of inconsistency between a data object and its surrounding data objects. The N data objects with the largest outlier factors are selected as outliers, and the outlier factor, the attribute values in the relevant subspace, and the local data set of each outlier are taken as its contextual information, which improves the interpretability of the detected outliers. Experiments on an artificial data set and UCI data sets validate the effectiveness of the algorithm.
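
The scoring and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' formula: the abstract does not state the weighted probability density expression, so the Gaussian kernel, the `bandwidth` parameter, the per-attribute `weights`, and the k-nearest-neighbor local data set used here are all assumptions. The relevant subspace is passed in as a list of attribute indices, because the Gaussian mixture model and sparsity matrix step is not reproduced. All function names are hypothetical.

```python
import math

def knn(data, i, k, subspace):
    """Indices of the k objects nearest to data[i], measured only in the subspace."""
    def dist2(j):
        return sum((data[i][a] - data[j][a]) ** 2 for a in subspace)
    others = [j for j in range(len(data)) if j != i]
    return sorted(others, key=dist2)[:k]

def weighted_density(x, neighbors, subspace, weights, bandwidth):
    """Assumed density: mean Gaussian kernel over an attribute-weighted distance."""
    total = 0.0
    for y in neighbors:
        d2 = sum(weights[a] * (x[a] - y[a]) ** 2 for a in subspace)
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total / len(neighbors)

def outlier_factors(data, subspace, weights, k=3, bandwidth=1.0):
    """LOF-style ratio: mean density of the local data set over the object's own density."""
    local_sets = [knn(data, i, k, subspace) for i in range(len(data))]
    dens = [
        weighted_density(data[i], [data[j] for j in local_sets[i]],
                         subspace, weights, bandwidth)
        for i in range(len(data))
    ]
    factors = []
    for i, nbrs in enumerate(local_sets):
        mean_nbr = sum(dens[j] for j in nbrs) / len(nbrs)
        factors.append(mean_nbr / max(dens[i], 1e-12))  # guard against zero density
    return factors, local_sets

def top_n_outliers(data, subspace, weights, n=1, k=3, bandwidth=1.0):
    """Return the n highest-scoring objects together with their contextual information."""
    factors, local_sets = outlier_factors(data, subspace, weights, k, bandwidth)
    ranked = sorted(range(len(data)), key=lambda i: factors[i], reverse=True)[:n]
    return [
        {
            "index": i,
            "outlier_factor": factors[i],
            "subspace_values": {a: data[i][a] for a in subspace},
            "local_data_set": local_sets[i],
        }
        for i in ranked
    ]

# Toy data: attribute 0 is assumed to be the relevant subspace, attribute 1 is ignored.
data = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [0.1, 3.0], [0.0, 7.0], [3.0, 4.0]]
result = top_n_outliers(data, subspace=[0], weights=[1.0, 0.0], n=1, k=3)
print(result[0]["index"], result[0]["subspace_values"])
```

In this toy data the last object lies far from the others along attribute 0, so it ranks first, and the returned dictionary carries the three context items named in the abstract: the outlier factor, the relevant-subspace attribute values, and the local data set.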

Keywords: Outlier detection; Correlation subspace; Weighted probability density; Contextual information

Bai Hui, Zhang Jifu

School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China

National Natural Science Foundation of China (61876122)

2024

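Computer Applications and Software
Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology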

Indexing: CSTPCD; Peking University Core Journals
Impact factor: 0.615
ISSN:1000-386X
Year, volume (issue): 2024, 41(2)