火力与指挥控制 (Fire Control & Command Control), 2024, Vol. 49, Issue 11: 128-132. DOI: 10.3969/j.issn.1002-0640.2024.11.017

结合MapReduce框架的离群因子检测算法

An Outlier Factor Detection Algorithm Based on the MapReduce Framework

Xu Shukui (徐树奎) 1, Zhang Yu (张煜) 2, Li Haixia (李海霞) 2, Chang Haiyan (常海艳) 2, Zhang Hewei (张和伟) 2

Author information

  • 1. Unit 31451 of the PLA, Shenyang 110000, China
  • 2. North Automatic Control Technology Institute, Taiyuan 030006, China

Abstract

Outlier factor detection aims to identify data objects that differ significantly from most other objects. In recent years, in some group-computing application scenarios, the data volume has become very large, and the cost of the Euclidean distance computations that the LOF algorithm uses to calculate local distances keeps growing. Two challenging problems arise: 1) the number of data objects across groups is very large, so the computation is slow and costly; 2) the dimensionality of the data objects keeps increasing, which raises the algorithm's time cost. The MapReduce computing framework is combined with the LOF algorithm to address these problems. Experiments show that the improved algorithm, which incorporates the MapReduce distributed computing framework, effectively improves the efficiency of detecting outliers in massive data.
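To make the idea in the abstract concrete, the following is a minimal single-process sketch of how the Euclidean-distance neighbour search in LOF could be split into a map step and a reduce step. It is not the authors' implementation: the abstract does not describe their partitioning scheme, so the function names (`map_knn`, `reduce_merge`, `lof_scores`), the round-robin chunking, and the broadcast of the full dataset to every mapper are all assumptions made for illustration. The sketch also uses exactly k neighbours per point (ties at the k-distance are ignored) and assumes no duplicate points.

```python
import math
from functools import reduce


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def map_knn(chunk, data, k):
    """Map step (hypothetical): for each (index, point) in one chunk,
    find its k nearest neighbours in the full dataset."""
    out = {}
    for i, p in chunk:
        dists = sorted(
            (euclidean(p, q), j) for j, q in enumerate(data) if j != i
        )
        out[i] = dists[:k]  # list of (distance, neighbour index)
    return out


def reduce_merge(a, b):
    """Reduce step (hypothetical): merge per-chunk neighbour tables."""
    merged = dict(a)
    merged.update(b)
    return merged


def lof_scores(data, k=2, n_chunks=2):
    """Compute LOF scores; the neighbour search is chunked map/reduce style.

    Assumes len(data) > k and no duplicate points (duplicates would make
    a reachability sum zero and the density undefined).
    """
    indexed = list(enumerate(data))
    # Round-robin partitioning of the query points (an assumption).
    chunks = [indexed[c::n_chunks] for c in range(n_chunks)]
    knn = reduce(reduce_merge, (map_knn(c, data, k) for c in chunks), {})

    # k-distance of each point: distance to its k-th nearest neighbour.
    k_dist = {i: nbrs[-1][0] for i, nbrs in knn.items()}

    def lrd(i):
        # Local reachability density: inverse mean reachability distance.
        reach = [max(k_dist[j], d) for d, j in knn[i]]
        return len(reach) / sum(reach)

    lrds = {i: lrd(i) for i in knn}
    # LOF: mean neighbour density divided by the point's own density.
    return [
        sum(lrds[j] for _, j in knn[i]) / (len(knn[i]) * lrds[i])
        for i in range(len(data))
    ]
```

Calling `lof_scores([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)], k=2)` gives scores close to 1 for the four clustered points and a much larger score for `(10, 10)`. In a real MapReduce deployment the mappers would run on separate nodes, and the density and LOF steps would typically be further map/reduce rounds; here they are computed locally to keep the sketch short.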

Key words

outlier factor detection / LOF algorithm / MapReduce framework / distributed computing

Publication year: 2024
火力与指挥控制 (Fire Control & Command Control)
火力与指挥控制研究会,火力与指挥控制专业情报网

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 0.312
ISSN:1002-0640