A Sarsa reinforcement learning hybrid ensemble method for robotic battery power forecasting
Peng Fei 1, Liu Hui 2, Zheng Li 3
Author information
- 1. Department of Industrial Engineering, Tsinghua University, Beijing 100084, China; CRRC Academy Co., Ltd., Beijing 100070, China
- 2. Institute of Artificial Intelligence & Robotics (IAIR), Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic and Transportation Engineering, Central South University, Changsha 410075, China
- 3. Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
Abstract
Building rail transit workshops with efficient data interconnection has become an inevitable trend in the transformation and development of the rail transit equipment industry. Increasingly diverse mobile transport robots have become key to the digital transformation of smart factories. Accurate prediction of robot battery power allows the control center to issue scientific and reasonable instructions in advance, ensuring efficient and stable operation of the logistics transportation chain. In this study, we propose a multi-learner hybrid ensemble method based on the state-action-reward-state-action (Sarsa) reinforcement learning algorithm. First, the maximal overlap discrete wavelet transform (MODWT) is used to preprocess the raw measured robot power supply voltage data, which significantly reduces the non-stationarity and volatility of the time series. Second, a gated recurrent unit (GRU), a deep belief network (DBN), and long short-term memory (LSTM) are used to build prediction models for the subseries obtained from the decomposition. Finally, the Sarsa reinforcement learning ensemble strategy is used to combine the three base predictors through weighting. The performance of the proposed Sarsa hybrid ensemble model is verified on three real mobile robot power data sets. Experimental results show that the hybrid forecasting model for transportation robot battery power is competitive in robustness, accuracy, and adaptability.
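The final step of the abstract, weighting three base predictors with Sarsa, can be illustrated with a small sketch. The abstract does not specify the state, action, or reward design, so everything below is an assumption made for illustration: the state is a weight vector on a coarse grid, an action moves one grid unit of weight from one predictor to another, and the reward is the resulting drop in mean squared error. The function name `sarsa_ensemble_weights` and all hyperparameters are hypothetical, and synthetic noisy series stand in for the GRU, DBN, and LSTM outputs.

```python
import numpy as np

def sarsa_ensemble_weights(preds, target, episodes=200, steps=30,
                           alpha=0.1, gamma=0.9, epsilon=0.2, grid=10, seed=0):
    """Hypothetical sketch: learn non-negative ensemble weights (summing to 1)
    for base predictions `preds` of shape (n_models, n_samples) with tabular Sarsa."""
    rng = np.random.default_rng(seed)
    n_models = preds.shape[0]
    # Assumed action set: move one grid unit of weight from model i to model j.
    actions = [(i, j) for i in range(n_models) for j in range(n_models) if i != j]
    q = {}  # Q-table keyed by (state, action index); a state is a tuple of grid units

    def mse(state):
        w = np.asarray(state) / grid
        return float(np.mean((w @ preds - target) ** 2))

    def step(state, a):
        src, dst = actions[a]
        if state[src] == 0:          # cannot take weight from an empty slot
            return state
        nxt = list(state)
        nxt[src] -= 1
        nxt[dst] += 1
        return tuple(nxt)

    def choose(state):               # epsilon-greedy behaviour policy
        if rng.random() < epsilon:
            return int(rng.integers(len(actions)))
        values = [q.get((state, a), 0.0) for a in range(len(actions))]
        return int(np.argmax(values))

    # Start every episode from (near-)equal weights.
    base, extra = divmod(grid, n_models)
    start = tuple(base + (1 if k < extra else 0) for k in range(n_models))
    best_state, best_err = start, mse(start)

    for _ in range(episodes):
        s = start
        a = choose(s)
        for _ in range(steps):
            s2 = step(s, a)
            r = mse(s) - mse(s2)     # assumed reward: reduction in ensemble MSE
            a2 = choose(s2)
            # Sarsa is on-policy: bootstrap from the action actually chosen next.
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r + gamma * q.get((s2, a2), 0.0) - old)
            s, a = s2, a2
            err = mse(s)
            if err < best_err:       # remember the best weights visited so far
                best_state, best_err = s, err

    return np.asarray(best_state) / grid, best_err

# Synthetic stand-ins for three base predictors of differing quality.
demo_rng = np.random.default_rng(1)
truth = np.sin(np.linspace(0, 6, 200))
base_preds = np.stack([truth + demo_rng.normal(0, s, truth.size)
                       for s in (0.05, 0.3, 0.6)])
weights, error = sarsa_ensemble_weights(base_preds, truth)
print("weights:", weights, "ensemble MSE:", error)
```

In the method the abstract describes, such a weighting step would operate on the forecasts of the three base predictors, with MODWT preprocessing and the neural models themselves omitted here. Returning the best weights visited, rather than the greedy policy's end state, is a simplification of this sketch.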
Key words
robotic power management/transportation robot/time series forecasting/deep learning/Sarsa reinforcement learning/ensemble model
Funding
Beijing New Star Program of Science and Technology, China (Z211100002121140)
National Natural Science Foundation of China (72188101)
Publication year
2023