Prediction of PPCPs Solid-Liquid Partition Coefficient Based on Machine Learning

In recent years, pharmaceuticals and personal care products (PPCPs) have attracted increasing attention as emerging contaminants. Studying the solid-liquid partition coefficient (Kd) of PPCPs in solid environmental media is crucial for understanding their fate and assessing their environmental risks. However, traditional methods based on linear partitioning carry high uncertainty, which limits their stability and accuracy. This study collected batch adsorption experimental data for 24 common PPCPs, including Kd, soil properties, experimental parameters, and compound molecular descriptors, to construct a dataset, and used machine learning to build predictive models for Kd. The results indicated that the Random Forest (RF) and Extreme Gradient Boosting (XGBoost) regression models performed similarly and both outperformed Support Vector Regression (SVR). SHAP analysis revealed that the octanol-water partition coefficient (logKOW), molar refractivity (MR), molecular weight (MW), solid-liquid ratio (RATIO), and organic carbon content (OC) had the most significant impact on Kd. Applicability domain analysis and model validation using literature-reported measurements of 12 PPCPs in 42 sediment samples from streams and rivers in Guangzhou City showed that, except for erythromycin and roxithromycin, the models constructed in this study predicted the Kd values of the remaining PPCPs well. Additionally, for compounds such as ciprofloxacin, ofloxacin, and sulfamethazine, whose solubility increases significantly under weakly acidic and weakly alkaline conditions, the method developed in this study may underestimate the actual Kd values in weakly acidic and weakly alkaline environments.
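
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how Kd is commonly derived from a single batch adsorption test and how a linear-partitioning estimate is usually formed from the organic carbon adsorption coefficient (KOC). It is not taken from the paper: the function names, argument names, and units are hypothetical, and it assumes the standard mass-balance definition of Kd and the common form Kd = KOC × fOC for the "traditional" linear approach.

```python
def kd_from_batch(c0_mg_per_l, ce_mg_per_l, volume_l, solid_mass_kg):
    """Kd (L/kg) from one batch test, using a simple mass balance.

    Assumes all solute lost from solution was sorbed to the solid
    (no degradation or wall losses); names and units are illustrative.
    """
    if ce_mg_per_l <= 0 or solid_mass_kg <= 0:
        raise ValueError("equilibrium concentration and solid mass must be positive")
    # Sorbed amount per unit mass of solid (mg/kg)
    sorbed_mg_per_kg = (c0_mg_per_l - ce_mg_per_l) * volume_l / solid_mass_kg
    # Kd is the ratio of sorbed to dissolved concentration at equilibrium
    return sorbed_mg_per_kg / ce_mg_per_l


def kd_linear_partitioning(koc_l_per_kg, f_oc):
    """Linear-partitioning estimate Kd = KOC * fOC (assumed baseline form).

    f_oc is the organic carbon mass fraction of the solid (0 to 1).
    """
    if not 0.0 <= f_oc <= 1.0:
        raise ValueError("f_oc must be a mass fraction between 0 and 1")
    return koc_l_per_kg * f_oc
```

A call such as `kd_from_batch(1.0, 0.5, 0.02, 0.002)` gives the measured Kd for one test, while `kd_linear_partitioning(200.0, 0.02)` gives the single-descriptor estimate. The linear form uses only organic carbon content, whereas the models described above also draw on molecular descriptors and experimental parameters such as the solid-liquid ratio.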

Keywords: batch adsorption; environmental risk assessment; molecular descriptors; random forest; organic carbon adsorption coefficient

张子衡、王美娥、马万凯、陈卫平

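Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou 450003

State Key Laboratory of Urban and Regional Ecology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085

College of Water Sciences, Beijing Normal University, Beijing 100875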

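Funding: National Key Research and Development Program of China (2021YFC1809103); National Key Research and Development Program of China (2022YFC3704804)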

2024

Asian Journal of Ecotoxicology (生态毒理学报)
Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.857
ISSN: 1673-5897
Year, volume (issue): 2024, 19(3)