In the Internet of Vehicles, the rational allocation of spectrum resources is of great significance for meeting the Quality of Service (QoS) requirements of different vehicular link services. To address challenges such as high vehicle mobility and the difficulty of obtaining global state information, a resource allocation algorithm based on fully distributed Multi-Agent Deep Reinforcement Learning (MADRL) is proposed. Taking vehicular communication delay and reliability into account, the algorithm maximizes network throughput by optimizing spectrum selection and power allocation strategies. A shared experience pool mechanism is introduced to resolve the non-stationarity caused by concurrent multi-agent learning. Built on the Deep Q Network (DQN), the algorithm uses a Long Short-Term Memory (LSTM) network to capture and exploit dynamic environment information, addressing the agents' partial observability. A Convolutional Neural Network (CNN) is combined with a Residual Network (ResNet) to enhance the algorithm's training accuracy and predictive capability. Experimental results show that the proposed algorithm satisfies the high-throughput requirement of Vehicle-to-Infrastructure (V2I) links and the low-latency requirement of Vehicle-to-Vehicle (V2V) links, and adapts well to environmental changes.
Resource Allocation for Vehicular Networking Based on Multi-agent Deep Reinforcement Learning
In vehicular networks, the rational allocation of spectrum resources is of great importance in meeting the Quality of Service (QoS) requirements of diverse vehicular link services. To address challenges such as high vehicular mobility and difficulties in obtaining global state information, a resource allocation algorithm based on fully distributed Multi-Agent Deep Reinforcement Learning (MADRL) is proposed. With vehicle communication delay and reliability taken into account, network throughput is maximized by optimizing spectrum selection and power allocation strategies. Firstly, a shared experience pool mechanism is introduced to tackle the non-stationarity caused by concurrent multi-agent learning. Secondly, dynamic environmental information is captured and utilized by a Deep Q Network (DQN) built upon Long Short-Term Memory (LSTM) networks, addressing the challenge of partially observable environments for agents. Finally, the training accuracy and predictive capability of the algorithm are enhanced by integrating a Convolutional Neural Network (CNN) with a Residual Network (ResNet). Experimental results demonstrate that the proposed algorithm meets the high-throughput requirements of Vehicle-to-Infrastructure (V2I) links and the low-latency requirements of Vehicle-to-Vehicle (V2V) links, while showing good adaptability to changing environments.
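To illustrate the shared experience pool mechanism mentioned in the abstract, the following is a minimal sketch, not the paper's implementation: all agents push their transitions into one common replay buffer, so each agent's training batches reflect the concurrently evolving policies of the others, which mitigates non-stationarity. Class and variable names (`SharedReplayBuffer`, the toy observations and rewards) are illustrative placeholders.

```python
import random
from collections import deque


class SharedReplayBuffer:
    """Experience pool shared by all agents: every agent writes its
    (state, action, reward, next_state) transitions here, and every
    agent samples training batches from the same pool."""

    def __init__(self, capacity):
        # deque with maxlen silently discards the oldest transitions
        self.buffer = deque(maxlen=capacity)

    def push(self, agent_id, state, action, reward, next_state):
        self.buffer.append((agent_id, state, action, reward, next_state))

    def sample(self, batch_size):
        # uniform sampling over transitions from all agents
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


# Toy usage: two V2V agents log transitions into the same pool.
pool = SharedReplayBuffer(capacity=1000)
for step in range(10):
    for agent_id in (0, 1):
        state = (step, agent_id)      # placeholder local observation
        action = step % 4             # e.g. an index of a spectrum sub-band
        reward = 1.0                  # placeholder reward (throughput/latency term)
        next_state = (step + 1, agent_id)
        pool.push(agent_id, state, action, reward, next_state)

batch = pool.sample(8)
print(len(pool), len(batch))  # 20 8
```

In a full MADRL setup each sampled batch would feed a per-agent DQN update; sharing the pool is what couples the agents' learning signals.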