
Combining KNN with AutoEncoder for Outlier Detection

K-nearest neighbor (KNN) is one of the most fundamental methods for unsupervised outlier detection because of its various advantages, e.g., ease of use and relatively high accuracy. Currently, most data analytic tasks need to deal with high-dimensional data, and the KNN-based methods often fail due to "the curse of dimensionality". AutoEncoder-based methods have recently been introduced to use reconstruction errors for outlier detection on high-dimensional data, but the direct use of AutoEncoder typically does not preserve the data proximity relationships well for outlier detection. In this study, we propose to combine KNN with AutoEncoder for outlier detection. First, we propose the Nearest Neighbor AutoEncoder (NNAE) by preserving the original data proximity in a much lower dimension that is more suitable for performing KNN. Second, we propose the K-nearest reconstruction neighbors (KNRNs) by incorporating the reconstruction errors of NNAE with the K-distances of KNN to detect outliers. Third, we develop a method to automatically choose better parameters for optimizing the structure of NNAE. Finally, using five real-world datasets, we experimentally show that our proposed approach NNAE+KNRN is much better than existing methods, i.e., KNN, Isolation Forest, a traditional AutoEncoder using reconstruction errors (AutoEncoder-RE), and Robust AutoEncoder.
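The core idea of the abstract, scoring each point by combining a low-dimensional reconstruction error with its KNN K-distance, can be sketched as follows. This is a minimal illustration, not the paper's method: a linear PCA projection stands in for the NNAE, and the min-max normalization and `alpha` weighting are assumptions made for the sketch rather than the paper's KNRN definition.

```python
import numpy as np

def k_distance(X, k):
    """Distance from each point to its k-th nearest neighbor."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, k]  # column 0 is the zero distance to the point itself

def reconstruction_error(X, n_components):
    """Linear reconstruction error via PCA, a stand-in for the NNAE."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T           # top principal directions
    residual = Xc - Xc @ V @ V.T      # part not captured by the subspace
    return np.linalg.norm(residual, axis=1)

def outlier_scores(X, k=3, n_components=2, alpha=0.5):
    """Combine K-distance and reconstruction error (illustrative weighting)."""
    kd = k_distance(X, k)
    re = reconstruction_error(X, n_components)
    kd = (kd - kd.min()) / (kd.max() - kd.min() + 1e-12)  # scale to [0, 1]
    re = (re - re.min()) / (re.max() - re.min() + 1e-12)
    return alpha * kd + (1 - alpha) * re

# Toy data: 50 inliers near a 2-D subspace of R^5, one outlier off it.
rng = np.random.default_rng(0)
inliers = np.zeros((50, 5))
inliers[:, :2] = rng.uniform(-5, 5, size=(50, 2))
inliers += rng.normal(0, 0.05, size=(50, 5))
outlier = np.zeros((1, 5))
outlier[0, 2] = 8.0
X = np.vstack([inliers, outlier])

scores = outlier_scores(X)
print(np.argmax(scores))  # index 50: the planted outlier scores highest
```

The planted outlier is extreme under both signals, so it ranks first regardless of `alpha`; in general, how the two scores are normalized and weighted matters, which is part of what the paper's KNRN formulation addresses.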

outlier detection; AutoEncoder; K-nearest neighbor (KNN); unsupervised learning

Shu-Zheng Liu, Shuai Ma, Han-Qing Chen, Li-Zhen Cui, Jie Ding


State Key Laboratory of Software Development Environment,Beihang University,Beijing 100191,China

School of Software,Shandong University,Jinan 250100,China

Joint SDU-NTU Centre for Artificial Intelligence Research,Shandong University,Jinan 250100,China

School of Computer Science,Jiangsu University of Science and Technology,Zhenjiang 212003,China


2024

Journal of Computer Science and Technology
China Computer Federation

CSTPCD
Impact factor: 0.432
ISSN:1000-9000
Year, Volume (Issue): 2024, 39(5)