Science & Technology for Development (科技促进发展), 2024, Vol. 20, Issue 4: 346-355. DOI: 10.11842/chips.20240408001

Research on Last Mile Routing Optimization Based on Deep Reinforcement Learning and Genetic Algorithm

LYU Yicheng (吕翊程) 1, SHAO Xueyan (邵雪焱) 2
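Author information

  • 1. Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190; School of Public Policy and Management, University of Chinese Academy of Sciences, Beijing 100049
  • 2. Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190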

Abstract

Last-mile route optimization is a key problem for improving the delivery efficiency of logistics enterprises. This study proposes a hybrid algorithm that combines learning to optimize (L2O), a deep reinforcement learning approach to combinatorial optimization, with a genetic algorithm to solve the last-mile routing problem. In the L2O module, the existing framework is extended with time and remaining-capacity encoders so that the network reflects the problem's time and capacity constraints. The genetic algorithm module adopts a restart strategy and sampling-probability regulation to make fuller use of the information in the L2O network. On a test set built from actual Amazon business data, the computational results show that, within the same solving time, the proposed algorithm outperforms the Gurobi solver and the extended pointer network algorithm.
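The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: a genetic algorithm whose population is sampled from a learned policy, with restarts and an adjustable sampling distribution. Everything in it is an assumption made for illustration. The function names (`sample_route`, `hybrid_ga`), the softmax temperature used as the "sampling probability" control, and the restart rule are hypothetical. A negative-distance score matrix stands in for the trained L2O network, and time-window and capacity constraints are omitted entirely.

```python
import math
import random

def route_length(route, dist):
    """Length of a closed tour that starts and ends at the depot (node 0)."""
    path = [0] + route + [0]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))

def sample_route(scores, temperature, rng):
    """Sample a route stop by stop from softmax(scores / temperature).

    scores[i][j] is a stand-in (assumption) for the preference a trained
    network would assign to visiting j right after i.
    """
    current, unvisited, route = 0, list(range(1, len(scores))), []
    while unvisited:
        logits = [scores[current][j] / temperature for j in unvisited]
        top = max(logits)
        weights = [math.exp(l - top) for l in logits]  # numerically stable softmax
        current = rng.choices(unvisited, weights=weights, k=1)[0]
        unvisited.remove(current)
        route.append(current)
    return route

def order_crossover(p1, p2, rng):
    """Keep a slice of p1 in place and fill the other positions in p2's order."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    kept = set(p1[i:j])
    rest = [c for c in p2 if c not in kept]
    return rest[:i] + p1[i:j] + rest[i:]

def mutate(route, rng):
    """Reverse a random segment of the route."""
    i, j = sorted(rng.sample(range(len(route) + 1), 2))
    return route[:i] + route[i:j][::-1] + route[j:]

def hybrid_ga(dist, pop_size=30, generations=200, patience=20, seed=0):
    rng = random.Random(seed)
    scores = [[-d for d in row] for row in dist]  # placeholder for network output
    temperature = 1.0
    pop = [sample_route(scores, temperature, rng) for _ in range(pop_size)]
    best = min(pop, key=lambda r: route_length(r, dist))
    stall = 0
    for _ in range(generations):
        pop.sort(key=lambda r: route_length(r, dist))
        parents = pop[: pop_size // 2]
        children = [mutate(order_crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
        cand = min(pop, key=lambda r: route_length(r, dist))
        if route_length(cand, dist) < route_length(best, dist) - 1e-12:
            best, stall = cand, 0
        else:
            stall += 1
        if stall >= patience:
            # Restart (hypothetical rule): flatten the sampling distribution
            # to diversify, keep the incumbent, and resample everyone else.
            temperature *= 1.5
            pop = [best] + [sample_route(scores, temperature, rng)
                            for _ in range(pop_size - 1)]
            stall = 0
    return best, route_length(best, dist)
```

Calling `hybrid_ga(dist)` on a square distance matrix whose row 0 is the depot returns the best route found, as a list of customer indices, together with its length. In a fuller version, the score matrix would be replaced by the output of a trained network, and feasibility checks for time windows and vehicle capacity would be added to sampling, crossover and mutation.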

Key words

deep reinforcement learning / genetic algorithm / last mile delivery / path optimization

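Publication year: 2024

Journal: Science & Technology for Development (科技促进发展)
Sponsors: Institute of Policy and Management, Chinese Academy of Sciences; China High-tech Industry Promotion Society
Impact factor: 0.629
ISSN: 1672-996X
Number of references: 4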