Review of Outlier Detection Algorithms (离群点检测算法综述)

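Chinese abstract: Outlier detection is an important research direction in data mining. It aims to uncover data that differ from the rest of a dataset and have potential analytical value, helping researchers identify possible problems in the data source. Outlier detection is now widely applied in fraud identification, smart healthcare, intrusion detection, fault diagnosis, and many other fields. Building on prior work, this paper first discusses the definition of outliers, their causes, and typical application areas; reviews the strengths and limitations of classical outlier detection algorithms such as DBSCAN and LOF and of their improved variants; and analyzes the advantages of deep learning methods for outlier detection. Second, in view of the need to process massive, high-dimensional, and time-series data in the current Internet setting, it examines how outlier detection algorithms have developed in these new environments. Finally, it introduces evaluation metrics for outlier detection algorithms, the role of cost factors in outlier detection evaluation, and commonly used toolkits and datasets, and then summarizes the challenges facing outlier detection and its future directions.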

English title: Review of Outlier Detection Algorithms

English abstract: Outlier detection, an important research direction in the field of data mining, aims to discover data points in a dataset that differ from the majority and have potential analytical value, and to assist researchers in identifying potential issues in the data source. Outlier detection has been widely applied in domains such as fraud detection, smart healthcare, intrusion detection, and fault diagnosis. Building on previous work, this study first discusses the definition of outliers, their causes, and typical application domains. It reviews the advantages and limitations of classical outlier detection algorithms such as DBSCAN and LOF, as well as their improved variants, and analyzes the advantages of deep learning methods in outlier detection. Second, considering the requirements for processing massive, high-dimensional, and temporal data in the current Internet context, it examines the development status of outlier detection algorithms in these new environments. Finally, it introduces the evaluation metrics of outlier detection algorithms, the role of cost factors in outlier detection evaluation, and commonly used toolkits and datasets, and it summarizes the challenges and future development directions of outlier detection.

English keywords:

Outliers; Anomaly detection; Deep learning; Time-series data; Data mining

Authors:

Kong Lingchao (孔翎超), Liu Guozhu (刘国柱)

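Affiliation:

College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong 266061, China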

Keywords:

Outliers (离群点); anomaly detection (异常检测); deep learning (深度学习); time-series data (时序数据); data mining (数据挖掘)

Funding:

National Natural Science Foundation of China

Grant number:

61973180

Publication year:

2024

DOI:

10.11896/jsjkx.230600052

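Computer Science (计算机科学)

Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)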

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.944

ISSN: 1002-137X

Year, volume (issue): 2024, 51(8)

Number of references: 6