Outlier Detection Based on Local Distribution Difference of Hybrid Nearest Neighbors
Outlier detection is an important task in data mining. Its purpose is to find data that are inconsistent with the rest of a data set representing events or object behaviors. Most traditional unsupervised outlier detection algorithms, such as distance-based and density-based methods, lose detection accuracy in multi-dimensional space because of the curse of dimensionality. This paper proposes an outlier detection algorithm based on hybrid nearest neighbors. The algorithm uses the hybrid nearest neighbors of a data item as a new local influence space, and redefines the similarity between data items by combining bidirectional shared nearest neighbors with Euclidean distance. The average local distribution difference of a sample within its local influence space measures its local outlier degree, which is then used to identify outliers. Experimental comparisons with other similar algorithms on synthetic and real data sets show that the proposed algorithm achieves some improvement in outlier detection.
unsupervised; outlier detection; hybrid nearest neighbors; local distribution difference
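The abstract describes the pipeline only in words, so the following is a minimal Python sketch of one possible reading, not the paper's actual definitions. Everything specific here is an assumption made for illustration: the function name `hybrid_nn_outlier_scores`, the hybrid neighborhood taken as the union of the k nearest neighbors and the reverse k nearest neighbors, the similarity taken as the shared-neighbor count divided by one plus the Euclidean distance, and the score taken as the neighbors' mean local similarity minus the point's own.

```python
import numpy as np

def hybrid_nn_outlier_scores(X, k=5):
    """Illustrative outlier scores for X of shape (n_samples, n_features).

    Larger scores mean more outlying. All formulas below are assumptions,
    not the definitions used in the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")

    # Pairwise Euclidean distances.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))

    # k nearest neighbors of each point, excluding the point itself.
    masked = dist.copy()
    np.fill_diagonal(masked, np.inf)
    knn_idx = np.argsort(masked, axis=1, kind="stable")[:, :k]
    knn = np.zeros((n, n), dtype=bool)
    knn[np.arange(n)[:, None], knn_idx] = True  # knn[i, j]: j is in kNN(i)

    # Assumption: hybrid neighborhood = kNN union reverse kNN (symmetric).
    hybrid = knn | knn.T

    # Assumption: shared neighbors are counted on closed neighborhoods
    # (each point belongs to its own neighborhood).
    closed = (hybrid | np.eye(n, dtype=bool)).astype(float)
    shared = closed @ closed.T  # shared[i, j] = size of the intersection

    # Assumption: similarity grows with shared neighbors, shrinks with distance.
    sim = shared / (1.0 + dist)

    # Mean similarity of each point to its hybrid neighbors.
    counts = hybrid.sum(axis=1)  # at least k, so never zero
    local = (sim * hybrid).sum(axis=1) / counts

    # Assumption: outlier degree = neighbors' mean local value minus own value.
    neigh_mean = (hybrid * local[None, :]).sum(axis=1) / counts
    return neigh_mean - local

# Ten evenly spaced 1-D points plus one isolated point at 50.
X = np.array([[float(v)] for v in range(10)] + [[50.0]])
scores = hybrid_nn_outlier_scores(X, k=2)
print(int(np.argmax(scores)))  # prints 10, the index of the isolated point
```

Making the neighborhood symmetric means an isolated point still receives a local influence space even when no other point selects it as a neighbor, which is one plausible motivation for a hybrid neighborhood. The choice of closed neighborhoods is also only a convenience of this sketch: it keeps adjacent points from ending up with an empty shared-neighbor set.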