
Outlier Detection Based on Autoencoder Normalizing Flow

Detecting outliers effectively in large, high-dimensional datasets is important in practice. Outlier detection identifies data points that deviate from the general data distribution, and density estimation is at its core. Models such as the deep autoencoder Gaussian mixture model have made substantial progress by first reducing dimensionality and then estimating density. However, that approach introduces noise into the low-dimensional latent space, and optimizing its density estimation module is subject to constraints, such as having to keep the covariance matrix positive definite. To address these limitations, this paper proposes the deep autoencoder normalizing flow (DANF) for unsupervised outlier detection. The model uses a deep autoencoder to produce a low-dimensional latent representation and a reconstruction error for each input sample, then feeds both into a normalizing flow (NF) that maps them to a Gaussian distribution. Experiments on several public benchmark datasets show that DANF significantly outperforms state-of-the-art outlier detection methods, improving the F1-score by up to 26.43%.
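The pipeline described in the abstract (autoencoder features, then a flow to a Gaussian, then a density-based score) can be illustrated with a minimal sketch. This is not the authors' DANF implementation: the abstract does not specify the architecture, so the sketch assumes a toy linear encoder with a 1-D latent code and a single element-wise affine flow layer with fixed parameters. All function and parameter names (`features`, `affine_flow_log_density`, `anomaly_score`, `w`, `shift`, `log_scale`) are hypothetical, and a real model would learn these parameters by maximizing likelihood on training data.

```python
import math


def encode(x, w):
    """Toy linear encoder (assumption): project x onto direction w, giving a 1-D latent code."""
    return sum(xi * wi for xi, wi in zip(x, w))


def decode(z, w):
    """Toy linear decoder (assumption): map the 1-D code back along w."""
    return [z * wi for wi in w]


def features(x, w):
    """Return [latent code, reconstruction error], the two quantities fed to the flow."""
    z = encode(x, w)
    x_hat = decode(z, w)
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
    return [z, err]


def affine_flow_log_density(f, shift, log_scale):
    """Log-density of f under the flow u = (f - shift) * exp(-log_scale), with u ~ N(0, I).

    Change of variables: log p(f) = log N(u; 0, I) + log |det du/df|.
    """
    log_p = 0.0
    for fi, s, ls in zip(f, shift, log_scale):
        u = (fi - s) * math.exp(-ls)
        log_p += -0.5 * (u * u + math.log(2.0 * math.pi))  # standard normal term
        log_p += -ls  # log |du/df| for this coordinate
    return log_p


def anomaly_score(x, w, shift, log_scale):
    """Negative log-density of the features; higher means more anomalous."""
    return -affine_flow_log_density(features(x, w), shift, log_scale)
```

With `w = [1.0, 0.0]` and an identity flow (`shift = [0.0, 0.0]`, `log_scale = [0.0, 0.0]`), a point far from the encoder's subspace, such as `[0.5, 3.0]`, has a larger reconstruction error and therefore a higher score than `[0.5, 0.1]`. Unlike a Gaussian mixture density, this flow needs no positive-definiteness constraint: any real-valued `shift` and `log_scale` give a valid density.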

outlier detection; unsupervised learning; normalizing flow (NF); invertible transform; density estimation

Zhong Haixin, Wang Hui, Guo Gongde


College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350117

School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, Belfast BT9 5BN


National Natural Science Foundation of China (61976053, 62171131)

2024

Computer Systems & Applications
Institute of Software, Chinese Academy of Sciences


CSTPCD
Impact factor: 0.449
ISSN:1003-3254
Year, volume (issue): 2024, 33(3)