Water Purification Technology (净水技术), 2024, Vol. 43, Issue 12: 96-102, 110. DOI: 10.15890/j.cnki.jsjs.2024.12.011

Modeling Water Quality Prediction for Aquaculture Wastewater Treatment by Isolation Forest Algorithm-Optimized XGBoost

Deng Zhicheng (邓志成)¹, Wan Jinquan (万金泉)¹, Wang Yan (王艳)¹, Zhu Bin (朱斌)², Wu Changzheng (吴昌政)¹, Ji Shiming (吉世明)²

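Author information

  • 1. School of Environment and Energy, South China University of Technology, Guangzhou, Guangdong 510006
  • 2. Guangdong Shunkong Zihua Technology Co., Ltd. (广东顺控自华科技有限公司), Foshan, Guangdong 528300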

Abstract

To address data distortion in water quality soft sensing, this study used the isolation forest (IF) algorithm to handle outliers in online monitoring data from water quality sensors, applied recursive feature elimination (RFE) to optimize the selection of model variables, and built water quality prediction models with the XGBoost algorithm to predict chemical oxygen demand (CODCr), total nitrogen (TN), and total phosphorus (TP) in the treated effluent of aquaculture fish-pond tailwater. Experiments showed that the XGBoost models for CODCr, TN, and TP in the bio-purification pond predicted well: the coefficients of determination (R2) were 0.837, 0.804, and 0.878, the mean absolute errors (MAE) were 0.679, 0.087, and 0.036, and the root mean square errors (RMSE) were 0.700, 0.105, and 0.044, respectively. After the IF algorithm was used to identify and remove outliers from the collected data, R2 rose to 0.875, 0.866, and 0.926, MAE fell to 0.658, 0.077, and 0.028, and RMSE fell to 0.681, 0.099, and 0.035, respectively. The study offers useful guidance for developing intelligent soft-sensing technology for water quality.
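The outlier-removal step named in the abstract can be illustrated with a minimal, standard-library sketch of the isolation forest idea: points that random axis-aligned splits isolate quickly get a high anomaly score. This is not the authors' implementation, which the abstract does not describe; practical work would more likely rely on a library such as scikit-learn's `IsolationForest`. The function names, the `contamination` parameter, and the two-column sensor readings below are all hypothetical and chosen only for illustration.

```python
import math
import random

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup
    among n points; used to normalise isolation depths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(point, node, depth=0):
    """Depth at which the point lands in a leaf, plus the leaf-size adjustment."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)

def anomaly_scores(points, n_trees=100, sample_size=256, seed=0):
    """Score each point as 2 ** (-mean_depth / c(psi)); higher means more anomalous.
    Requires at least two points."""
    rng = random.Random(seed)
    psi = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(points, psi), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(psi)
    scores = []
    for p in points:
        mean_depth = sum(path_length(p, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores

def drop_outliers(points, contamination=0.05, **kwargs):
    """Remove the highest-scoring fraction of points (hypothetical helper)."""
    scores = anomaly_scores(points, **kwargs)
    n_drop = int(round(contamination * len(points)))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_drop])
    cleaned = [p for i, p in enumerate(points) if i not in flagged]
    return cleaned, sorted(flagged)

# Made-up readings (two arbitrary sensor channels), not data from the paper:
# 20 clustered points plus one injected spike at index 20.
readings = [(7.0 + 0.05 * (i % 5), 6.0 + 0.1 * (i % 4)) for i in range(20)]
readings.append((9.5, 0.5))

cleaned, flagged = drop_outliers(readings, contamination=0.05)
print("flagged indices:", flagged)
print("rows kept:", len(cleaned))
```

In a workflow like the one the abstract outlines, the cleaned rows would then go on to feature selection and model fitting; the share of points to drop is a tuning choice that the abstract does not specify.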

Key words

machine learning/isolation forest/abnormal value detection/aquaculture wastewater/water quality prediction

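Publication year

2024
Water Purification Technology (净水技术)
Shanghai Water Purification Technology Society (上海市净水技术学会); Office of the Science and Technology Committee, Shanghai Municipal Commission of Urban-Rural Construction and Transportation (上海市城乡建设和交通委员会科学技术委员会办公室)

CSTPCD
Impact factor: 0.643
ISSN: 1009-0177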