An Improved Double Deep Q Network Algorithm for Service Function Chain Deployment

Network function virtualization (NFV) has become a key technology for future communication networks, and the efficient deployment of dynamic service function chains (SFCs) is one of the pressing problems in improving network performance. To reduce the energy consumption of network servers and improve the quality of service, a dynamic SFC deployment algorithm based on an improved double deep Q network (DDQN) is proposed. Because both the network state and the SFCs change dynamically, the SFC deployment problem is first modeled as a Markov decision process (MDP). The reward is computed from the state of the network resources and the selected action, and the DDQN is trained online to obtain the optimal deep neural network model, which in turn determines the optimal online SFC deployment strategy. Traditional deep reinforcement learning draws experience samples uniformly from the experience replay pool, which leads to low learning efficiency of the neural network. To address this, a prioritized experience replay method based on importance sampling is designed to draw experience samples, which effectively avoids high correlation between training samples and further improves the efficiency of offline neural network learning. Simulation results show that the proposed SFC deployment algorithm based on the improved DDQN increases the reward value and, compared with the traditional DDQN algorithm, reduces energy consumption by about 19.89% to 36.99% and the blocking rate by about 9.52% to 16.37%.
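The abstract names two mechanisms without giving details: a double deep Q network target and a prioritized experience replay with importance-sampling weights. The sketch below illustrates the generic, widely used form of both (proportional prioritization and the standard double-DQN target), and is not the authors' exact design. The class name `PrioritizedReplay`, the function `double_dqn_targets`, and the hyperparameters `alpha`, `beta`, `eps`, and `gamma` are assumptions for illustration; the state, action, and reward definitions for SFC deployment are not given in the abstract and are left abstract here.

```python
import numpy as np


class PrioritizedReplay:
    """Proportional prioritized replay buffer (generic form, assumed here)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6, seed=0):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0
        self.rng = np.random.default_rng(seed)

    def add(self, transition):
        # New transitions receive the current maximum priority so that
        # they are likely to be replayed at least once.
        max_p = self.priorities[:len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        n = len(self.data)
        scaled = self.priorities[:n] ** self.alpha
        probs = scaled / scaled.sum()
        idx = self.rng.choice(n, size=batch_size, p=probs)
        # Importance-sampling weights correct the bias introduced by
        # non-uniform sampling; here they are normalized by the batch maximum.
        weights = (n * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update(self, idx, td_errors):
        # Priority is the absolute TD error of the sampled transition.
        self.priorities[idx] = np.abs(td_errors) + self.eps


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN target: the online network selects the next action,
    the target network evaluates it."""
    best = np.argmax(q_online_next, axis=1)
    next_q = q_target_next[np.arange(len(best)), best]
    return rewards + gamma * (1.0 - dones) * next_q
```

In a training loop, the weights returned by `sample` would typically multiply the per-sample loss, and the resulting TD errors would be passed back through `update` to refresh the priorities.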

Keywords: service function chain; Markov decision process; network energy consumption; DDQN

Authors: 刘道华, 魏丁二, 宣贺君, 余长鸣, 寇丽博

School of Computer and Information Technology, Xinyang Normal University, Xinyang, Henan 464000, China

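Funding: National Natural Science Foundation of China (61572417); Henan Province Science and Technology Research Project (222102210265); Henan Province Undergraduate University Research-Oriented Teaching Reform Project (2022SYJXLX061); Key Scientific Research Project of Henan Higher Education Institutions (22A520007); Henan Province Graduate Education Reform and Quality Improvement Project (YJS2024AL104)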

2024

西安电子科技大学学报(自然科学版) (Journal of Xidian University, Natural Science Edition)
Xidian University

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.837
ISSN: 1001-2400
Year, volume (issue): 2024, 51(1)