
Tugboat Dynamic Scheduling Method Based on an Improved Sarsa Algorithm

This paper studies tugboat dynamic scheduling from the perspective of optimizing the Sarsa algorithm, addressing the shortcomings of the traditional algorithm. Within a reinforcement learning framework, the state and environment information of the tugboats is used to establish a state-action function, which is then used to search for the optimal tugboat scheduling strategy. The update rule for the Q function in the Sarsa algorithm is modified to overcome slow convergence. In addition, the learning-rate selection mode and the action selection method are used to balance exploration and exploitation, improving the convergence speed and performance of the algorithm. Simulation results on a worked example show that the algorithm effectively shortens ship waiting time and thereby improves the utilization efficiency of tugboat resources.
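For orientation, the textbook Sarsa update and an epsilon-greedy action selection can be sketched as follows. The abstract does not give the modified Q-function update or the learning-rate selection mode, so this is the standard algorithm and not the authors' improved variant. The toy environment (`toy_step`), the two-action set, the cap on waiting ships, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

N_ACTIONS = 2      # hypothetical: 0 = hold the tug, 1 = dispatch the tug
MAX_WAITING = 3    # hypothetical cap on the number of waiting ships


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = [q[(state, a)] for a in range(N_ACTIONS)]
    return values.index(max(values))  # ties resolve to the lowest action index


def sarsa_update(q, s, a, reward, s_next, a_next, alpha, gamma, done):
    """Standard on-policy update: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a))."""
    target = reward if done else reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])
    return q[(s, a)]


def toy_step(waiting, action, rng):
    """Hypothetical dynamics: dispatching serves one ship; a new ship may arrive."""
    served = 1 if (action == 1 and waiting > 0) else 0
    arrival = 1 if rng.random() < 0.5 else 0
    next_waiting = min(MAX_WAITING, waiting - served + arrival)
    reward = -float(next_waiting)  # penalize ships left waiting
    return next_waiting, reward


def train(episodes=200, steps=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Sarsa on the toy problem and return the learned Q table."""
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q, s, epsilon, rng)
        for t in range(steps):
            s_next, r = toy_step(s, a, rng)
            # Sarsa is on-policy: the next action is chosen before the update
            a_next = epsilon_greedy(q, s_next, epsilon, rng)
            done = (t == steps - 1)
            sarsa_update(q, s, a, r, s_next, a_next, alpha, gamma, done)
            s, a = s_next, a_next
    return q
```

Calling `train()` returns a dictionary keyed by `(state, action)`. Because every reward in this toy problem is non-positive, all learned values are non-positive, and a larger value for a given state marks the preferred action. Raising `epsilon` favors exploration, while lowering it favors exploitation of the current estimates.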

Sarsa algorithm; tugboats; adaptive scheduling; reinforcement learning; algorithm strategy

Li Jiachen (李佳琛), Duan Xingfeng (段兴锋)


Navigation College, Jimei University, Xiamen 361000, Fujian, China


Natural Science Foundation of Fujian Province project

2019J01325

2024

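Journal of Chongqing University of Science and Technology (Natural Sciences Edition)
Chongqing University of Science and Technology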


Impact factor: 0.329
ISSN: 1673-1980
Year, volume (issue): 2024, 26(3)