Optimizing the number of operating units in a central air-conditioning cooling source system together with their operating parameters is a joint optimization problem over both discrete and continuous variables, which poses challenges for classical reinforcement learning algorithms. To address this problem, this paper proposes an energy-saving optimization control strategy for central air-conditioning cooling source systems that combines the option-critic and actor-critic frameworks. First, a hierarchical actor-critic (H-AC) algorithm is used to optimize the number of units and the operating parameters hierarchically, with the high-level and low-level models sharing a Q-network to evaluate state values, thereby addressing optimization across multiple time scales. Second, the H-AC algorithm is improved in terms of agent architecture, policy, and network update mechanisms to accelerate the agent's convergence. Finally, the proposed method is validated on the cooling source system of a research building located in a hot-summer, warm-winter region, using a TRNSYS simulation platform for the experiments. The results show that, compared with four classical deep reinforcement learning (DRL) algorithms, the improved H-AC algorithm reduces system energy consumption by 32.28%, 28.55%, 28.64%, and 11.53%, while increasing the average proportion of indoor comfort time by 14.08, 11.23, 29.70, and 9.07 percentage points, respectively. Although the system energy consumption of the improved H-AC algorithm is 0.27% higher than that of the option-critic framework, it achieves a more stable learning process and increases the average proportion of indoor comfort time by 4.8%. This approach offers an effective technical solution for energy-saving optimization of central air-conditioning cooling source systems in various building types, contributing to the achievement of dual-carbon goals in the building sector.
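The two-time-scale structure described above can be illustrated with a minimal Python sketch. It is not the paper's H-AC implementation: no learning takes place, and the shared critic is replaced by a hand-written toy scoring function. All names (`shared_value`, `high_level_policy`, `low_level_policy`, `run_episode`), the per-unit capacity, the energy coefficients, and the setpoint candidates are hypothetical values chosen only for illustration. The sketch shows only how a high-level decision (discrete number of units, revised every few steps) and a low-level decision (continuous chilled-water setpoint, revised every step) can both be scored by one shared evaluator.

```python
def shared_value(load_kw, n_units, setpoint_c):
    """Toy stand-in for the shared critic (hypothetical model, not from the paper).

    Returns a score where higher is better: negative energy use minus a
    penalty for cooling load that the running units cannot meet.
    """
    # Assumed: each unit delivers less cooling as the setpoint rises.
    capacity_kw = n_units * 500.0 * (1.0 - 0.05 * (setpoint_c - 5.0))
    # Assumed: each unit draws less power as the setpoint rises.
    energy_kw = n_units * (100.0 - 8.0 * (setpoint_c - 5.0))
    unmet_kw = max(0.0, load_kw - capacity_kw)
    return -(energy_kw + 10.0 * unmet_kw)


def high_level_policy(load_kw, max_units=4, nominal_setpoint_c=7.0):
    """Pick the discrete number of units, scored at a nominal setpoint."""
    return max(
        range(1, max_units + 1),
        key=lambda n: shared_value(load_kw, n, nominal_setpoint_c),
    )


def low_level_policy(load_kw, n_units, candidates=(5.0, 6.0, 7.0, 8.0, 9.0)):
    """Pick the setpoint for the unit count fixed by the high level."""
    return max(candidates, key=lambda sp: shared_value(load_kw, n_units, sp))


def run_episode(loads_kw, high_level_period=4):
    """Run both levels: unit count every `high_level_period` steps, setpoint every step."""
    units, setpoints = [], []
    n_units = None
    for t, load_kw in enumerate(loads_kw):
        if t % high_level_period == 0:  # slow time scale
            n_units = high_level_policy(load_kw)
        setpoint_c = low_level_policy(load_kw, n_units)  # fast time scale
        units.append(n_units)
        setpoints.append(setpoint_c)
    return units, setpoints
```

Calling `run_episode([400.0] * 4 + [1200.0] * 4)` returns one unit count and one setpoint per step; the unit count can change only at steps 0 and 4, while the setpoint is re-selected at every step. In an actual actor-critic setting, both greedy selections would be replaced by learned policies, and `shared_value` by a trained Q-network.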