A comparative study of machine learning algorithm models for predicting carbon emissions of residential buildings in cold zones

Liu Yiming (刘依明)¹, Yang Junhan (杨珺涵)², Zhang Zhongli (张忠利)³, Xu Peiqi (许沛琪)⁴, Liu Nianxiong (刘念雄)⁴

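Author information

1. Zhejiang Construction Investment Group Co., Ltd., Hangzhou 310012
2. China Southwest Architectural Design and Research Institute Co., Ltd., Chengdu 610000
3. Linyi City Construction Investment Group Co., Ltd., Linyi 276000
4. School of Architecture, Tsinghua University, Beijing 100084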

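Abstract (translated from the Chinese)

Machine learning models provide data support for the low-carbon design and optimization of residential buildings. In carbon emission prediction and analysis, however, such models are often used directly, without parameter tuning or optimization, and how different independent variable datasets affect predictive performance has yet to be clarified. To reveal how well different models can guide low-carbon residential design in cold zones, and to give architects a basis for choosing among them, this study tunes the models commonly used in low-carbon design (multiple linear regression, classification and regression tree, random forest, adaptive boosting, gradient boosting regression tree, and multilayer perceptron) and compares the applicability and predictive performance of the different algorithms and independent variable datasets. The paper describes the target boundaries, parameter value ranges, tuning process, and validation methods used for model optimization. Based on 37 reinforced concrete shear wall residential buildings in cold zones and their derivative schemes, cross-validation and grid search were used to build 120 models for predicting building material carbon emissions and 60 models for predicting the conversion coefficient that converts steady-state heat consumption into dynamic heat consumption. The comparison shows that, overall, multiple linear regression, random forest, and gradient boosting regression tree predict carbon emissions better than the other algorithms. Of these, random forest and gradient boosting regression tree control error better, but their goodness of fit is close to that of multiple linear regression and they are less interpretable. Given an appropriate independent variable dataset, such as form and scale parameters (total number of floors, floor height, building width and depth), functional configuration parameters (number of households and bedrooms per standard floor), and urban meteorological parameters (average outdoor temperature during the heating period, actual number of heating days, and correction factors for roof and wall heat transfer coefficients), multiple linear regression can provide more intuitive and effective guidance for the low-carbon design and optimization of residential buildings in cold zones.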

Abstract

[Objective] Machine learning algorithms provide valuable data support for designing and optimizing low-carbon residential buildings. However, these models are often applied directly to carbon emission prediction and analysis without proper parameter tuning and optimization, and the impact of different independent variable datasets on predictive performance remains to be clarified. In China's cold zones, where residential buildings share similar architectural structures, energy-saving designs, and spatial layouts, carbon emissions come primarily from the operational phase and the production stage of building materials, with heating emissions being a significant component. This study aims to elucidate how effectively different machine learning algorithm models guide low-carbon residential design in these cold zones and to offer architects criteria for selecting suitable algorithms. It focuses on automatic parameter tuning and optimization for several algorithms commonly used in low-carbon building design: multiple linear regression, classification and regression tree, random forest, adaptive boosting, gradient boosting regression tree, and multilayer perceptron. The study compares the performance limits and applicability of these algorithms and independent variable datasets in predicting carbon emissions during the building material production and heating stages.

[Methods] This paper describes the target boundaries, parameter ranges, optimization processes, and validation methods used to optimize the machine learning algorithm models. Through research and simulation analysis of 37 reinforced concrete shear wall residential buildings and their derivative schemes in cold zones, multiple independent variable datasets suitable for building predictive models are identified. Cross-validation and grid search are used to find the predictive performance limits of the different algorithms and independent variable datasets. With the six algorithms above, 120 models are established for predicting carbon emissions from building materials, and 60 models are established for predicting the coefficient that converts steady-state heating consumption into dynamic heating consumption.

[Results] A side-by-side comparison of the models shows that, after hyperparameter tuning, multiple linear regression, random forest, and gradient boosting regression tree perform relatively well in carbon emission prediction (R² over 0.900) across different independent variable datasets. Random forest and gradient boosting regression tree models excel in error control and offer predictive accuracy similar to that of multiple linear regression, but they lack interpretability. In contrast, multiple linear regression models provide clearer equations and stronger guidance for low-carbon design and optimization aimed at reducing carbon emissions in the building material production or winter heating stages. Models based on the total residential building area perform best in predicting building material carbon emissions. Models built on parameters such as the number of above-ground and underground floors, building width and depth, total number of households, number of bedrooms per standard floor, and total number of bathrooms in the building also predict building material carbon emissions well. For predicting the conversion coefficient in the heating stage, including the number of households and bedrooms per standard floor as independent variables significantly improves predictive performance.

[Conclusions] Although various machine learning models are useful for predicting residential building carbon emissions, the multiple linear regression model stands out for its strong predictive performance and its intuitive representation of how design parameters affect carbon emissions. With appropriate independent variable datasets, such as the total number of floors, floor height, building dimensions, number of households and bedrooms per floor, and corrected coefficients for urban meteorological parameters (including average outdoor temperature during the heating season, actual heating days, and roof and wall heat transfer coefficients), or with the finally determined total building area, the multiple linear regression algorithm can deliver timely and multi-faceted guidance. These results are important for low-carbon design and optimization during the primary stages of the residential lifecycle in China's cold zones.
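The tuning procedure summarized in the Methods paragraph (grid search scored by cross-validation, repeated for each candidate algorithm) can be illustrated with a minimal sketch. It assumes scikit-learn, uses synthetic data rather than the 37 surveyed buildings, and covers only three of the six algorithms; the feature names, the target formula, and the hyperparameter grids are hypothetical choices made for illustration, not the settings used in the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in data (NOT the study's dataset): 120 hypothetical schemes
# described by number of floors, floor height (m), width (m), and depth (m).
rng = np.random.default_rng(0)
n = 120
floors = rng.integers(6, 34, size=n)
height = rng.uniform(2.8, 3.1, size=n)
width = rng.uniform(20.0, 60.0, size=n)
depth = rng.uniform(12.0, 18.0, size=n)
X = np.column_stack([floors, height, width, depth])

# Hypothetical target: emissions roughly proportional to floor area, plus noise.
area = floors * width * depth
y = 0.45 * area + rng.normal(0.0, 0.02 * area.mean(), size=n)

# Candidate algorithms, each with a small illustrative hyperparameter grid.
candidates = {
    "MLR": (LinearRegression(), {"fit_intercept": [True, False]}),
    "RF": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},
    ),
    "GBRT": (
        GradientBoostingRegressor(random_state=0),
        {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]},
    ),
}

# 5-fold cross-validation; every grid point is scored by its mean R^2.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=cv, scoring="r2")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)
    print(f"{name}: mean CV R2 = {search.best_score_:.3f}, "
          f"best params = {search.best_params_}")
```

Comparing the best cross-validated score per algorithm, rather than default-setting scores, is what allows a like-for-like comparison of each algorithm's performance limit. Swapping the columns of `X` for a different set of design parameters repeats the comparison for another independent variable dataset.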

Key words

carbon emission/machine learning/design parameters/cross-validation/grid search

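Funding

Zhejiang Provincial Science and Technology Program Project (2023C03173)

National Key Research and Development Program of China (2022YFC3803800)

Key Program of the National Natural Science Foundation of China (52130803)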

Publication year

2024

Journal of Tsinghua University (Science and Technology) (清华大学学报(自然科学版))

Tsinghua University

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.586

ISSN: 1000-0054

Number of references: 34
