Journal of System Simulation, 2024, Vol. 36, Issue 6: 1425-1432. DOI: 10.16182/j.issn1004731x.joss.23-0137

Adaptive PID Control Algorithm Based on PPO

Zhou Zhiyong¹, Mo Fei¹, Zhao Kai², Hao Yunbo², Qian Yufeng¹

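Author information

  • 1. School of Mechanical Engineering, Shanghai Dianji University, Shanghai 201306
  • 2. Shanghai Aerospace Equipment Manufacturer Co., Ltd., Shanghai 200245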

Abstract

A six-axis robotic arm is built using the MATLAB physics engine together with Python, and a complex control environment with disturbances is simulated, giving the arm a trial-and-error training environment that cannot be provided in reality. The proximal policy optimization (PPO) algorithm from reinforcement learning is used to improve the traditional PID control algorithm. Following a multi-agent approach, and based on the different effects of the three PID parameters on the control system and on the characteristics of the six-axis robotic arm, each of the three parameters is trained as a separate agent, yielding a new multi-agent adaptive PID algorithm in which the agents adjust the parameters adaptively. Simulation results show that the algorithm converges better in training than the MA-DDPG and MA-SAC algorithms. Compared with the traditional PID algorithm, it suppresses oscillation more effectively when disturbances and oscillations occur, has lower overshoot and a shorter adjustment time, and gives a smoother control process, which effectively improves the control accuracy of the robotic arm and demonstrates the robustness and effectiveness of the algorithm.
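The structure described in the abstract, one agent per PID gain acting on a simulated plant, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy single-joint plant (unit mass with viscous damping), the `simulate`, `fixed_agents`, and `overshoot` helpers, and all numeric gains are assumptions made for the example. The agents here are constant functions standing in for trained PPO policies. `ppo_clip_objective` shows the standard per-sample PPO clipped surrogate, which a training loop would maximize for each agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical interface: an "agent" maps the current tracking error to one PID gain.
GainAgent = Callable[[float], float]


@dataclass
class PID:
    """Discrete PID controller whose gains may be changed between steps."""
    kp: float = 0.0
    ki: float = 0.0
    kd: float = 0.0
    dt: float = 0.005
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float) -> float:
        self.integral += error * self.dt
        # No derivative term on the first call, which avoids a startup kick.
        rate = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * rate


def ppo_clip_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def fixed_agents(kp: float, ki: float, kd: float) -> Dict[str, GainAgent]:
    """Constant-gain placeholders; trained policies would take their place."""
    return {"kp": lambda e: kp, "ki": lambda e: ki, "kd": lambda e: kd}


def simulate(agents: Dict[str, GainAgent], steps: int = 1000, dt: float = 0.005,
             setpoint: float = 1.0, damping: float = 1.0) -> List[float]:
    """Step response of a toy joint (assumed plant: acceleration = u - damping * velocity)."""
    pid = PID(dt=dt)
    pos, vel = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        error = setpoint - pos
        # Each agent independently sets its own gain before the control step.
        pid.kp = agents["kp"](error)
        pid.ki = agents["ki"](error)
        pid.kd = agents["kd"](error)
        u = pid.step(error)
        vel += (u - damping * vel) * dt  # semi-implicit Euler integration
        pos += vel * dt
        trajectory.append(pos)
    return trajectory


def overshoot(trajectory: List[float], setpoint: float = 1.0) -> float:
    """Peak excursion above the setpoint, as a fraction of the setpoint."""
    return max(0.0, (max(trajectory) - setpoint) / setpoint)


if __name__ == "__main__":
    for kd in (1.0, 9.0):
        traj = simulate(fixed_agents(25.0, 0.0, kd))
        print(f"kd={kd}: overshoot={overshoot(traj):.3f}, final={traj[-1]:.4f}")
```

Keeping each gain behind its own callable mirrors the idea of training the three parameters as separate agents: any one of the three functions can be swapped for a learned policy without touching the controller or the plant model.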

Key words

reinforcement learning (RL) / proximal policy optimization (PPO) algorithm / adaptive PID tuning / robotic arm / multi-agent


Funding

Major Industrial Technology Research Program of Minhang District, Shanghai (2022MH-ZD20)

Publication year

2024
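Journal of System Simulation
Beijing Simulation Center; Chinese Association for System Simulation
Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 0.551
ISSN: 1004-731X
Number of references: 4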