An Outlier Factor Detection Algorithm Based on the MapReduce Framework
The purpose of outlier detection is to detect data objects that differ significantly from most other objects. In recent years, in some group computing application scenarios, the amount of data has become very large, and the cost of the Euclidean distance calculations that the Local Outlier Factor (LOF) algorithm uses to compute local distances keeps increasing. Two problems are particularly challenging: 1) the calculation is slow and costly because of the large number of data objects across groups; 2) the dimensionality of the data objects is gradually increasing, which further raises the time cost of the algorithm. To address these problems, the MapReduce computing framework is combined with the LOF algorithm. Experiments show that the improved algorithm, which incorporates the MapReduce distributed computing framework, effectively improves the efficiency of detecting outliers in massive data.
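As an illustration of how LOF can be phrased in map and reduce steps, the following is a minimal single-process sketch in Python. It is not the implementation evaluated in this work. The function names, the round-robin partitioning, and the choice to distribute only the k-nearest-neighbour search are all assumptions made for the example. It also assumes no duplicate points and takes exactly k neighbours per point, ignoring ties in the k-distance.

```python
import heapq
import math
from collections import defaultdict


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def map_knn_candidates(points, partition, k):
    """Map step (hypothetical): for every point, emit its k nearest
    candidates among the points of one partition."""
    for i, p in enumerate(points):
        dists = [(euclidean(p, points[j]), j) for j in partition if j != i]
        yield i, heapq.nsmallest(k, dists)


def reduce_knn(candidate_lists, k):
    """Reduce step (hypothetical): merge the per-partition candidates of one
    point into its global k nearest neighbours."""
    merged = [c for cands in candidate_lists for c in cands]
    return heapq.nsmallest(k, merged)


def lof_scores(points, k, num_partitions=2):
    """Return one LOF score per point; scores well above 1 suggest outliers."""
    n = len(points)
    # Assumed partitioning scheme: round-robin over point indices.
    partitions = [range(s, n, num_partitions) for s in range(num_partitions)]

    # Simulated shuffle: group mapper output by point index.
    grouped = defaultdict(list)
    for part in partitions:
        for key, cands in map_knn_candidates(points, part, k):
            grouped[key].append(cands)

    knn = {i: reduce_knn(grouped[i], k) for i in range(n)}
    k_dist = {i: knn[i][-1][0] for i in range(n)}

    # Local reachability density: inverse of the mean reachability distance.
    lrd = {}
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in knn[i]]
        lrd[i] = len(reach) / sum(reach)

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in knn[i]) / (len(knn[i]) * lrd[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 2))
```

In this sketch each mapper scans only one partition of the data, so the distance computations, which dominate the cost, can run in parallel. The reducer merges short candidate lists rather than recomputing any distances.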