Adaptive Distance-Based Outlier Detection Algorithm
Nearest-neighbour-based outlier detection methods identify outliers from the neighbours surrounding each data object, but they are highly sensitive to the threshold parameter and mostly perform well only when the data follow a single distribution. To address both the difficulty of detecting outliers in data with diverse distributions and the sensitivity to the threshold parameter, an adaptive distance-based outlier detection algorithm is proposed. First, the contribution factor of each data attribute is adjusted dynamically so that key attributes have more influence on outlier detection, which more accurately reflects the correlation between key attributes and outliers. Second, the distance between data objects is calculated by jointly considering the attribute contribution factors and the density, so as to better capture both the positional relationship between data objects and the characteristics of the density distribution. Finally, to reduce the influence of the threshold parameter, the neighbourhood size is increased gradually, and the changes in each data object's adaptive distance are accumulated as its outlier score. Experiments on synthetic and public datasets show that the proposed algorithm achieves higher detection accuracy.
data mining; outlier detection; attribute contribution factor; density distribution; adaptive distance
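
The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so every concrete choice here is an assumption made for illustration and not the authors' method: the contribution factor is taken to be a kurtosis-like tail measure of each standardized attribute, the adaptive distance is taken to be a point's mean k-nearest-neighbour distance scaled by its sparsity relative to its neighbours, and the function names and the `k_min`/`k_max` parameters are hypothetical.

```python
import numpy as np


def attribute_weights(X):
    """Assumed contribution factors: heavier-tailed attributes get more weight."""
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardize attributes
    tail = (Z ** 4).mean(axis=0)       # kurtosis-like tail measure per attribute
    return Z, tail / tail.sum()        # weights are normalized to sum to 1


def weighted_distances(Z, w):
    """Pairwise Euclidean distances with each attribute scaled by its weight."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((w * diff ** 2).sum(axis=2))


def adaptive_outlier_scores(X, k_min=1, k_max=5):
    """Accumulate the changes in an assumed adaptive distance as k grows."""
    Z, w = attribute_weights(X)
    D = weighted_distances(Z, w)
    np.fill_diagonal(D, np.inf)        # a point is not its own neighbour
    order = np.argsort(D, axis=1)      # neighbour indices, nearest first
    sorted_D = np.take_along_axis(D, order, axis=1)

    scores = np.zeros(len(X))
    previous = None
    for k in range(k_min, k_max + 1):  # requires k_max < number of points
        mean_kdist = sorted_D[:, :k].mean(axis=1)        # sparsity around each point
        neighbour_sparsity = mean_kdist[order[:, :k]].mean(axis=1)
        # Assumed adaptive distance: k-NN distance times relative sparsity.
        adaptive = mean_kdist * mean_kdist / neighbour_sparsity
        if previous is not None:
            scores += np.abs(adaptive - previous)        # sum of changes over k
        previous = adaptive
    return scores


# Toy data: a regular 5x5 grid plus one isolated point appended last.
grid = np.array([[x, y] for x in range(5) for y in range(5)], dtype=float)
X = np.vstack([grid, [[20.0, 20.0]]])
scores = adaptive_outlier_scores(X)
print(int(np.argmax(scores)))  # index of the highest-scoring point
```

Because the score sums changes over a range of neighbourhood sizes, it does not depend on a single choice of k, which is the property the abstract attributes to the accumulation step. The sketch assumes no duplicate points (which would make the neighbour sparsity zero) and no constant attributes.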