Science China Information Sciences (English edition), 2024, Vol. 67, Issue 4: 17-33. DOI: 10.1007/s11432-023-3982-y

Online Pareto optimal control of mean-field stochastic multi-player systems using policy iteration

Xiushan JIANG (1), Yanshuang WANG (1), Dongya ZHAO (1), Ling SHI (2)

Author information

  • 1. College of New Energy, China University of Petroleum (East China), Qingdao 266580, China
  • 2. Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong 999077, China

Abstract

In this study, the Pareto optimal strategy problem is investigated for multi-player mean-field stochastic systems governed by Itô differential equations using reinforcement learning (RL). A partially model-free solution for Pareto optimal control is derived. First, by exploiting the convexity of the cost functions, the Pareto optimal control problem is solved via a weighted-sum optimal control problem. Subsequently, using on-policy RL, a novel policy iteration (PI) algorithm is presented based on the H-representation technique. In particular, by alternating between policy evaluation and policy update steps, the Pareto optimal control policy is obtained when no further improvement occurs in system performance, which avoids directly solving the complicated cross-coupled generalized algebraic Riccati equations (GAREs). Practical numerical examples are presented to demonstrate the effectiveness of the proposed algorithm.
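
The alternation between policy evaluation and policy update can be illustrated on a much simpler problem than the one treated in the paper. The sketch below is not the authors' algorithm: it assumes a deterministic linear system with no mean-field or noise terms, full knowledge of the model, and the classical Kleinman iteration in place of the H-representation-based, partially model-free scheme. All matrices, the weights, and the function names are hypothetical values chosen for illustration. It only shows how a weighted sum of two players' quadratic costs turns the Pareto problem into a single optimal control problem that policy iteration solves without solving a Riccati equation directly.

```python
import numpy as np


def solve_lyapunov(a_cl, m):
    """Solve a_cl.T @ P + P @ a_cl + m = 0 for symmetric P (Kronecker form)."""
    n = a_cl.shape[0]
    eye = np.eye(n)
    lhs = np.kron(a_cl.T, eye) + np.kron(eye, a_cl.T)
    p = np.linalg.solve(lhs, -m.reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)  # symmetrize against round-off


def weighted_sum_policy_iteration(a, b, q_list, r_list, weights, k0,
                                  tol=1e-10, max_iter=100):
    """Kleinman-style policy iteration on the weighted-sum cost.

    a, b    : system matrices, b stacks all players' input channels
    q_list  : per-player state weights
    r_list  : per-player weights on the stacked control vector
    weights : positive Pareto weights (assumed to sum to one)
    k0      : initial stabilizing gain, u = -k0 @ x
    """
    q = sum(w * qi for w, qi in zip(weights, q_list))
    r = sum(w * ri for w, ri in zip(weights, r_list))
    k, p_prev = k0, None
    for _ in range(max_iter):
        # Policy evaluation: cost matrix of the current gain.
        p = solve_lyapunov(a - b @ k, q + k.T @ r @ k)
        # Policy update: greedy gain with respect to the evaluated cost.
        k = np.linalg.solve(r, b.T @ p)
        # Stop when the evaluated cost no longer changes.
        if p_prev is not None and np.linalg.norm(p - p_prev) < tol:
            break
        p_prev = p
    return p, k


# Hypothetical two-player example; a is stable, so k0 = 0 is stabilizing.
a = np.array([[-1.0, 1.0], [0.0, -2.0]])
b = np.eye(2)  # column 0: player 1 input, column 1: player 2 input
q_list = [np.eye(2), np.diag([2.0, 1.0])]
r_list = [np.diag([1.0, 2.0]), np.diag([3.0, 1.0])]
weights = (0.4, 0.6)

p, k = weighted_sum_policy_iteration(a, b, q_list, r_list, weights,
                                     k0=np.zeros((2, 2)))
```

Sweeping `weights` over the simplex would trace out different Pareto-efficient gains in this toy setting; the paper's setting additionally involves stochastic and mean-field terms that this sketch leaves out.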

Key words

mean-field stochastic systems; Pareto optimal control; policy iteration scheme; H-representation


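Funding

National Natural Science Foundation of China (62103442)

National Natural Science Foundation of China (12326343)

National Natural Science Foundation of China (62373229)

Natural Science Foundation of Shandong Province (ZR2021QF080)

Fundamental Research Funds for the Central Universities (23CX06024A)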

Outstanding Youth Innovation Team in Shandong Higher Education Institutions(2023KJ061)

Publication year

2024
Science China Information Sciences (English edition)
Chinese Academy of Sciences

Indexed in: CSTPCD, EI
Impact factor: 0.715
ISSN: 1674-733X
Number of references: 33