Detecting Abnormal Water Extraction Data Based on Isolation Forest
To detect outliers in the water withdrawal data of water supply enterprises quickly and accurately, an unsupervised learning algorithm based on isolation forest is proposed. Water withdrawal data from four water supply enterprises (A-D), provided by the Anhui water resource intake monitoring platform, are taken as an example. The algorithm is compared experimentally with the traditional boxplot method and the supervised k-nearest neighbor algorithm. The results show that, owing to its unique tree structure, the isolation forest algorithm achieves average F1 and AUC values of 0.9630 and 0.9980, respectively, which are about 56.40% and 22.47% higher than those of the k-nearest neighbor algorithm and 18.92% and 9.70% higher than those of the boxplot method. Although the performance of the isolation forest algorithm degrades when abnormal water intake behavior within an interval is simulated, its stability remains better than that of the k-nearest neighbor algorithm and the boxplot method, which indicates that the isolation forest algorithm has certain advantages in detecting different types of abnormal data.
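The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the withdrawal series is synthetic (the values, the seed, and the injected anomalies are all hypothetical), the isolation forest is a simplified one-dimensional version written with the Python standard library only, the cutoff assumes the number of anomalies is known in advance, and the k-nearest neighbor baseline is omitted for brevity.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(values, depth, max_depth, rng):
    """Recursively partition 1-D values at uniformly random cut points."""
    if depth >= max_depth or len(values) <= 1:
        return ("leaf", len(values))
    lo, hi = min(values), max(values)
    if lo == hi:
        return ("leaf", len(values))
    split = rng.uniform(lo, hi)
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return ("node", split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Depth at which x lands in a leaf, plus a correction for unsplit leaf points."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, split, left, right = tree
    return path_length(left if x < split else right, x, depth + 1)

def isolation_scores(values, n_trees=100, sample_size=256, seed=0):
    """Anomaly score in (0, 1]; a shorter average path gives a higher score."""
    rng = random.Random(seed)
    m = min(sample_size, len(values))
    max_depth = math.ceil(math.log2(m))
    trees = [build_tree(rng.sample(values, m), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(m)
    return [2.0 ** (-(sum(path_length(t, x) for t in trees) / n_trees) / norm)
            for x in values]

def boxplot_flags(values, k=1.5):
    """Flag values outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v < q1 - k * iqr or v > q3 + k * iqr for v in values]

def f1_score(truth, flags):
    tp = sum(t and f for t, f in zip(truth, flags))
    fp = sum((not t) and f for t, f in zip(truth, flags))
    fn = sum(t and (not f) for t, f in zip(truth, flags))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def auc(truth, scores):
    """Probability that a random anomaly scores higher than a random normal point."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical daily withdrawal volumes: 300 normal readings plus 10 injected anomalies.
rng = random.Random(42)
normal = [rng.gauss(1000.0, 50.0) for _ in range(300)]
abnormal = ([rng.uniform(0.0, 400.0) for _ in range(5)]
            + [rng.uniform(1600.0, 2000.0) for _ in range(5)])
values = normal + abnormal
truth = [False] * len(normal) + [True] * len(abnormal)

scores = isolation_scores(values)
# Assumption: the anomaly count is known, so the top-scoring points are flagged.
cutoff = sorted(scores, reverse=True)[len(abnormal) - 1]
forest_flags = [s >= cutoff for s in scores]
box_flags = boxplot_flags(values)

print("isolation forest F1:", round(f1_score(truth, forest_flags), 4))
print("isolation forest AUC:", round(auc(truth, scores), 4))
print("boxplot F1:", round(f1_score(truth, box_flags), 4))
```

On such cleanly separated synthetic data both detectors are expected to do well, so the printed values say nothing about the figures reported above. The sketch only shows how the anomaly score follows from average path length in random trees and how F1 and AUC can be computed for the two unsupervised detectors.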