
Resource allocation algorithm for distinguished services in vehicular networks based on multi-agent deep reinforcement learning

The Internet of Vehicles (IoV) generates a massive number of network connections and highly diversified data. A single agent struggles to collect channel state information in dynamic scenarios and to perform service-differentiated resource allocation and link scheduling. To address this, a service-differentiated resource allocation algorithm for IoV based on multi-agent deep reinforcement learning is proposed. The algorithm aims to maximize the successful packet delivery rate of V2V links and the total capacity of V2I links, under the constraint of minimizing interference to emergency service links. It uses deep reinforcement learning to optimize the spectrum allocation and power selection policies in a single-antenna vehicular network where multiple cellular users and device-to-device users coexist. Each agent is trained with a deep Q-network (DQN); the agents interact jointly with the communication environment and cooperate through a global reward function. Simulation results show that, in high-load scenarios and compared with a traditional random allocation algorithm, the proposed algorithm increases the total throughput of V2I links by 3.76 Mbps and improves the packet delivery rate of V2V links by 17.1%, while the interference experienced by emergency service links is 1.42 dB lower than that of ordinary links. The algorithm thus guarantees priority for emergency service links and effectively improves the total transmission capacity of V2I and V2V links.
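As a rough illustration of the ingredients named in the abstract (a joint spectrum/power action per agent, epsilon-greedy DQN action selection, and one global reward shared by all agents), a minimal Python sketch follows. The function names, the linear form of the reward, and the weights `w_v2i`, `w_v2v`, and `w_emg` are all assumptions made for illustration; the abstract does not specify the paper's actual reward function or network design.

```python
import random


def decode_action(action, n_power_levels):
    """Map a flat action index to a (sub_band, power_level) pair.

    Assumes each agent's discrete action jointly encodes a spectrum
    sub-band and a transmit power level (hypothetical encoding).
    """
    return divmod(action, n_power_levels)


def global_reward(v2i_capacities, v2v_delivered, emergency_interference,
                  w_v2i=0.1, w_v2v=1.0, w_emg=0.5):
    """Single scalar reward handed to every agent to encourage cooperation.

    The linear combination and the weights are illustrative assumptions,
    not the reward function used in the paper.
    """
    v2i_term = sum(v2i_capacities)                     # total V2I capacity
    v2v_term = sum(v2v_delivered) / len(v2v_delivered)  # delivery fraction
    return w_v2i * v2i_term + w_v2v * v2v_term - w_emg * emergency_interference


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    # One step for three hypothetical agents sharing the same reward.
    q_tables = [[0.2, 0.7, 0.1, 0.4], [0.9, 0.3, 0.5, 0.6], [0.1, 0.1, 0.8, 0.2]]
    actions = [epsilon_greedy(q, epsilon=0.1) for q in q_tables]
    choices = [decode_action(a, n_power_levels=2) for a in actions]
    reward = global_reward([10.0, 20.0], [1, 0, 1], emergency_interference=2.0)
    print(choices, reward)
```

Because every agent receives the same scalar, an agent that raises its own link quality by interfering with an emergency link lowers the reward for all agents, which is one simple way to express cooperation through a shared signal.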

internet of vehicles; spectrum allocation; reinforcement learning; multi-agent; emergency services

Cai Yu (蔡玉), Guan Zheng (官铮), Wang Zengwen (王增文), Wang Xue (王学), Yang Zhijun (杨志军)


School of Information Science and Engineering, Yunnan University, Kunming 650500, Yunnan, China


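Yunnan Applied Basic Research Program (202201AT070167); Yunnan Expert Workstation Project (202305AF150045); Scientific Research Fund of the Yunnan Provincial Department of Education (2023Y0246)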

2024

Computer Engineering & Science (计算机工程与科学)
College of Computer, National University of Defense Technology

CSTPCD; Peking University Core Journals
Impact factor: 0.787
ISSN:1007-130X
Year, volume (issue): 2024, 46(10)