An irrelevant attributes resistant approach to anomaly detection in high-dimensional space using a deep hypersphere structure
Original article links: NSTL | Elsevier
Detecting anomalies that exist only in subspaces of a high-dimensional space is a grand challenge. Most existing state-of-the-art methods rely, implicitly or explicitly, on distances; however, as dimensionality grows, the contrasts (e.g., distances) between data objects become increasingly similar, so distance-based criteria lose discriminative power. Moreover, high-dimensional spaces may include many irrelevant attributes that mask anomalies (if the prior probability of a class remains unchanged regardless of the value observed for an attribute att, att is said to be irrelevant to that class, i.e., att is an irrelevant attribute). Since anomalies can exist in any subspace, it is difficult to select the subspaces that highlight the relevant attributes from an exponentially large search space. To address this issue, we propose a hybrid method that combines a deep network with a hypersphere to detect anomalies. The deep network serves as a feature extractor that captures low-dimensional features from the background space; anomalies are then separated by the hypersphere in a feature space reconstructed from the probability distribution. To prevent irrelevant attributes from being mistaken for anomalies during mining, an upper bound on the number of anomalies is estimated by Chebyshev's theorem. Finally, the proposed method was verified on synthetic and real-world datasets. Experimental results show that it outperforms existing state-of-the-art detection methods in the accuracy of mining anomalies and in noise resistance. We find that feature extractors can improve the noise resistance of anomaly detection methods, and that in the feature space reconstructed from the probability distribution, anomalous features are easily distinguished from irrelevant and normal features.
We also show that irrelevant attributes increase the complexity of the feature space; by calculating the probability distribution of data in the background space, layered features can be extracted to distinguish anomaly classes, normal classes, and irrelevant-attribute classes.
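The two ingredients named in the abstract, distance-to-center scoring in a learned feature space and a Chebyshev-based upper bound on the anomaly count, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the mean is used as the hypersphere center, the feature extractor is omitted (features are taken as given), and the choice k = 3 is an assumption.

```python
import numpy as np

def chebyshev_anomaly_bound(n, k=3.0):
    # Chebyshev's inequality: P(|X - mu| >= k*sigma) <= 1/k^2,
    # so at most n/k^2 of n points can lie k standard deviations
    # from the mean. This caps how many points may be flagged,
    # which keeps noise from irrelevant attributes from inflating
    # the anomaly set.
    return int(np.floor(n / k**2))

def hypersphere_scores(features):
    # Score each feature vector by its distance to the hypersphere
    # center (here simply the mean of the features; the paper's
    # center comes from the learned feature space).
    center = features.mean(axis=0)
    return np.linalg.norm(features - center, axis=1)

def detect(features, k=3.0):
    # Flag at most `budget` points with the largest distances,
    # where `budget` is the Chebyshev upper bound.
    scores = hypersphere_scores(features)
    budget = chebyshev_anomaly_bound(len(features), k)
    order = np.argsort(scores)[::-1]
    flags = np.zeros(len(features), dtype=bool)
    flags[order[:budget]] = True
    return flags
```

For example, in a 200-point Gaussian sample with k = 3, the budget is floor(200/9) = 22, so no more than 22 points can be declared anomalous regardless of how noisy the irrelevant attributes are.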