To guarantee users' quality-of-service (QoS) delay in 6G network scenarios and to address the long convergence time of deep reinforcement learning (DRL), a computing network under a cloud-edge-device architecture is studied. A multi-critic deep reinforcement learning framework is proposed, and on this basis a knowledge-embedded multi-critic deep reinforcement learning algorithm is developed, which embeds wireless communication knowledge into deep reinforcement learning and combines DRL with the computing network to allocate the computing and spectrum resources in the network. Simulation results show that, compared with traditional deep reinforcement learning methods, the proposed method effectively reduces the convergence time and, in a time-varying channel environment, achieves real-time decision making while guaranteeing user delay.
Knowledge-embedding deep reinforcement learning algorithm for 6G network decision making
In this paper, a computing network based on a cloud-edge-device architecture is studied to guarantee QoS delay for users in 6G networks and to address the long convergence time of deep reinforcement learning. A multi-critic deep reinforcement learning framework is proposed, and on this basis a knowledge-embedded multi-critic deep reinforcement learning algorithm is proposed. Wireless communication knowledge is embedded into deep reinforcement learning, and the combination of deep reinforcement learning and the computing network is adopted to allocate computing resources and spectrum resources in the network. Simulation results show that the proposed method can effectively reduce the convergence time compared with traditional deep reinforcement learning methods, and can achieve real-time decision making while guaranteeing user delay in a time-varying channel environment.
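The abstract describes a multi-critic framework with embedded wireless knowledge but gives no implementation details. The following is a minimal illustrative sketch, not the authors' code: it assumes PyTorch, hypothetical state/action dimensions, two critics, and the Shannon rate as an example of embedded wireless-communication knowledge; the names Actor, Critic, and shannon_rate are placeholders introduced here.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: 8-dimensional state, 4 users receiving resource shares,
# and 2 critics (e.g. one scoring task delay, one scoring spectrum efficiency).
STATE_DIM, ACTION_DIM, N_CRITICS = 8, 4, 2

class Actor(nn.Module):
    """Policy network: maps a state to a resource-allocation vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Softmax(dim=-1))  # fractions of the shared resource

    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Value network: scores a (state, action) pair for one objective."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def shannon_rate(bandwidth_hz, snr_linear):
    """Embedded wireless knowledge: achievable rate C = B * log2(1 + SNR)."""
    return bandwidth_hz * torch.log2(1.0 + snr_linear)

# Knowledge embedding (illustrative): feed the Shannon rate of each user's channel
# into the state, rather than letting the agent rediscover it from raw SNR.
bandwidth = torch.full((32, ACTION_DIM), 1e6)            # assumed 1 MHz per user
snr = torch.rand(32, ACTION_DIM) * 10.0                  # sampled linear SNR values
rate_features = shannon_rate(bandwidth, snr) / 1e7       # normalised rate features
other_features = torch.randn(32, STATE_DIM - ACTION_DIM) # e.g. queue length, CPU load
state = torch.cat([rate_features, other_features], dim=-1)

actor = Actor()
critics = [Critic() for _ in range(N_CRITICS)]
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

# One illustrative actor update: average the value estimates of all critics so the
# policy is pushed toward allocations that every objective's critic rates highly.
action = actor(state)
value = torch.stack([c(state, action) for c in critics]).mean()
loss = -value  # gradient ascent on the aggregated critic value
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the critics are untrained and the update is a single step; a full training loop (critic regression against observed delay/throughput rewards, target networks, exploration) would be required to reproduce anything like the behaviour reported in the paper.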