
Research Progress of Anomaly Detection in IaaS Cloud Operation Driven by Deep Learning
Anomaly detection is a key task in the operation and maintenance of IaaS cloud systems: early warning and timely intervention can prevent serious incidents such as system crashes. Compared with traditional data centers, however, IaaS cloud systems have far more computing nodes, complex node topologies, large volumes of monitoring data, and few labels, all of which pose new challenges for anomaly detection in IaaS cloud operations. Starting from the technical framework of deep learning, this paper analyzes the difficulties of the anomaly detection problem and surveys common anomaly detection algorithms and related techniques for IaaS cloud systems. It reviews deep-learning-driven solutions to two typical problems, node anomalies and system anomalies. For node-level anomalies, it focuses on detection algorithms driven by time series data for time-dependent operations data. For system-level anomalies, it focuses on detection algorithms driven by graph data under network topology modeling. Finally, it identifies new problems and challenges in data-driven anomaly detection for IaaS cloud operations data.

Anomaly detection; IaaS cloud; Time series data; Graph data; Deep learning; Machine learning

Si Jia, Liang Jianfeng, Xie Shuo, Deng Yingjun

National Marine Data and Information Service, Tianjin 300171

Center for Applied Mathematics, Tianjin University, Tianjin 300072

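Youth Fund of the National Marine Data and Information Service; Open Fund of the State Key Laboratory of Marine Resource Utilization in South China Sea

202102006; MRUKF2021035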

2024

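Computer Science
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)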

CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(z1)