Science China Information Sciences, 2024, Vol. 67, Issue (4): 127-141. DOI: 10.1007/s11432-023-3853-y

Robust cooperative multi-agent reinforcement learning via multi-view message certification

Lei YUAN 1, Tao JIANG 2, Lihe LI 2, Feng CHEN 2, Zongzhang ZHANG 2, Yang YU 1

Author information

  • 1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Polixir Technologies, Nanjing 211106, China
  • 2. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China

Abstract

Many multi-agent scenarios require agents to share messages to promote coordination, which makes robust multi-agent communication essential when policies are deployed in environments where messages are perturbed. Most related studies tackle this issue under specific assumptions, for example that only a limited number of message channels are perturbed, which limits their efficiency in complex scenarios. In this paper, we take a further step in addressing this issue by proposing a robust cooperative multi-agent reinforcement learning method via multi-view message certification, dubbed CroMAC. Agents trained under CroMAC obtain guaranteed lower bounds on state-action values, which let them identify and choose the optimal action under a worst-case deviation when the received messages are perturbed. Concretely, we first model multi-agent communication as a multi-view problem, where every message stands for a view of the state. We then extract a certified joint message representation with a multi-view variational autoencoder (MVAE) that uses a product-of-experts inference network. In the optimization phase, we apply perturbations in the latent space of the state to obtain a certification guarantee. The learned joint message representation is then used to approximate the certified state representation during training. Extensive experiments on several cooperative multi-agent benchmarks validate the effectiveness of the proposed CroMAC.
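
To make the product-of-experts step more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each received message is encoded as a diagonal Gaussian over the latent state (a mean and a log-variance per dimension) and fuses these "experts" by multiplying their densities, which for Gaussians amounts to summing precisions and taking a precision-weighted mean. The function names `product_of_experts` and `fuse_messages`, the array shapes, and the inclusion of a standard-normal prior expert are all illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def product_of_experts(means, logvars, eps=1e-8):
    """Fuse diagonal-Gaussian experts N(mu_i, var_i) into a single Gaussian.

    means, logvars: arrays of shape (num_experts, latent_dim).
    Returns (mean, logvar) of the product, each of shape (latent_dim,).
    """
    means = np.asarray(means, dtype=float)
    logvars = np.asarray(logvars, dtype=float)
    # Precision (inverse variance) of every expert; eps guards against 1/0.
    precisions = 1.0 / (np.exp(logvars) + eps)
    # A product of Gaussians is Gaussian: precisions add up ...
    joint_var = 1.0 / precisions.sum(axis=0)
    # ... and the mean is the precision-weighted average of expert means.
    joint_mean = joint_var * (precisions * means).sum(axis=0)
    return joint_mean, np.log(joint_var)


def fuse_messages(msg_means, msg_logvars):
    """Hypothetical helper: combine per-message posteriors with a N(0, I) prior expert.

    Adding a prior expert is an assumption made for this sketch; it keeps the
    fused distribution well defined even when few messages are available.
    """
    msg_means = np.asarray(msg_means, dtype=float)
    msg_logvars = np.asarray(msg_logvars, dtype=float)
    latent_dim = msg_means.shape[1]
    prior_mean = np.zeros((1, latent_dim))
    prior_logvar = np.zeros((1, latent_dim))  # log(1) = 0, i.e. unit variance
    means = np.concatenate([prior_mean, msg_means], axis=0)
    logvars = np.concatenate([prior_logvar, msg_logvars], axis=0)
    return product_of_experts(means, logvars)
```

One property of this fusion rule is that a message with high variance (low precision) contributes little to the joint representation, so an uncertain view is automatically down-weighted relative to confident ones.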

Key words

multi-agent reinforcement learning; robust communication; adversarial training; multi-view learning; message certification

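Funding

National Key Research and Development Program of China (2020AAA0107200)

National Natural Science Foundation of China (61921006)

National Natural Science Foundation of China (61876119)

National Natural Science Foundation of China (62276126)

Natural Science Foundation of Jiangsu Province (BK20221442)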

Program B for Outstanding PhD Candidate of Nanjing University()

Publication year

2024
Science China Information Sciences
Chinese Academy of Sciences

Indexed in: CSTPCD, EI
Impact factor: 0.715
ISSN: 1674-733X
Number of references: 93