
Combining KNN with AutoEncoder for Outlier Detection

K-nearest neighbor (KNN) is one of the most fundamental methods for unsupervised outlier detection because of its various advantages, e.g., ease of use and relatively high accuracy. Currently, most data analytic tasks need to deal with high-dimensional data, and the KNN-based methods often fail due to "the curse of dimensionality". AutoEncoder-based methods have recently been introduced to use reconstruction errors for outlier detection on high-dimensional data, but the direct use of AutoEncoder typically does not preserve the data proximity relationships well for outlier detection. In this study, we propose to combine KNN with AutoEncoder for outlier detection. First, we propose the Nearest Neighbor AutoEncoder (NNAE) by preserving the original data proximity in a much lower dimension that is more suitable for performing KNN. Second, we propose the K-nearest reconstruction neighbors (KNRNs) by incorporating the reconstruction errors of NNAE with the K-distances of KNN to detect outliers. Third, we develop a method to automatically choose better parameters for optimizing the structure of NNAE. Finally, using five real-world datasets, we experimentally show that our proposed approach NNAE+KNRN is much better than existing methods, i.e., KNN, Isolation Forest, a traditional AutoEncoder using reconstruction errors (AutoEncoder-RE), and Robust AutoEncoder.
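The core idea of the abstract, scoring each point by combining a low-dimensional reconstruction error with its KNN K-distance, can be sketched as follows. This is a minimal illustration, not the paper's method: a linear PCA projection stands in for the NNAE, and the min-max normalization and `alpha` weighting are assumptions made for the sketch rather than the paper's KNRN definition.

```python
import numpy as np

def k_distance(X, k):
    """Distance from each point to its k-th nearest neighbor."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, k]  # column 0 is the zero distance to the point itself

def reconstruction_error(X, n_components):
    """Linear reconstruction error via PCA, a stand-in for the NNAE."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T           # top principal directions
    residual = Xc - Xc @ V @ V.T      # part not captured by the subspace
    return np.linalg.norm(residual, axis=1)

def outlier_scores(X, k=3, n_components=2, alpha=0.5):
    """Combine K-distance and reconstruction error (illustrative weighting)."""
    kd = k_distance(X, k)
    re = reconstruction_error(X, n_components)
    kd = (kd - kd.min()) / (kd.max() - kd.min() + 1e-12)  # scale to [0, 1]
    re = (re - re.min()) / (re.max() - re.min() + 1e-12)
    return alpha * kd + (1 - alpha) * re

# Toy data: 50 inliers near a 2-D subspace of R^5, one outlier off it.
rng = np.random.default_rng(0)
inliers = np.zeros((50, 5))
inliers[:, :2] = rng.uniform(-5, 5, size=(50, 2))
inliers += rng.normal(0, 0.05, size=(50, 5))
outlier = np.zeros((1, 5))
outlier[0, 2] = 8.0
X = np.vstack([inliers, outlier])

scores = outlier_scores(X)
print(np.argmax(scores))  # index 50: the planted outlier scores highest
```

The planted outlier is extreme under both signals, so it ranks first regardless of `alpha`; in general, how the two scores are normalized and weighted matters, which is part of what the paper's KNRN formulation addresses.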

outlier detection; AutoEncoder; K-nearest neighbor (KNN); unsupervised learning

Shu-Zheng Liu, Shuai Ma, Han-Qing Chen, Li-Zhen Cui, Jie Ding


State Key Laboratory of Software Development Environment,Beihang University,Beijing 100191,China

School of Software,Shandong University,Jinan 250100,China

Joint SDU-NTU Centre for Artificial Intelligence Research,Shandong University,Jinan 250100,China

School of Computer Science,Jiangsu University of Science and Technology,Zhenjiang 212003,China


2024

Journal of Computer Science and Technology
China Computer Federation

CSTPCD
Impact factor: 0.432
ISSN:1000-9000
Year, Volume (Issue): 2024, 39(5)