Optimization method for hazardous freight transportation routes based on multi-agent meta-reinforcement learning
For the route optimization problem of hazardous material transportation vehicles, a transportation company that uses multiple vehicles to serve all customers is modeled as a multi-agent system to enhance collaborative efficiency among vehicles. The optimization objectives are to minimize travel time and transportation risk, subject to time window and load capacity constraints. By constructing a multi-agent reinforcement learning model and applying the meta-RL method, a meta-model with greater generalization ability is established. Hazardous freight transportation problems with different weighting schemes are abstracted as subtasks for optimizing multivehicle, multitrip routes with time windows. Different embedding layers of deep neural network models are leveraged to capture the high-dimensional features of the subtasks. By effectively combining the Reptile meta-learning algorithm with the rolling baseline approach, our method enhances adaptability to different subtasks, and it makes solution computation more flexible by greedily selecting the actions with the highest probabilities. The experimental results demonstrate that the proposed multi-agent meta-reinforcement learning method outperforms transfer reinforcement learning methods, achieving a 12% improvement in the non-dominated point count and a 22% improvement in the hypervolume. Thus, the solutions of the proposed method lie closer to the Pareto-optimal front. Furthermore, among the different decoding methods, beam search sampling exhibits superior performance.
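
The results above are reported as a non-dominated point count and a hypervolume. As a rough illustration of how these two indicators are commonly computed for a bi-objective minimization problem (here, travel time and risk), a minimal sketch follows. The function names, the sample points, and the reference point are hypothetical and are not taken from the paper, whose exact evaluation setup is not given in the abstract.

```python
from typing import List, Tuple

# Hypothetical objective vector: (travel_time, risk), both to be minimized.
Point = Tuple[float, float]


def non_dominated(points: List[Point]) -> List[Point]:
    """Return the unique points that no other point dominates (minimization)."""
    front = []
    for p in points:
        # q dominates p if q is no worse in both objectives and differs from p.
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    # Deduplicate and sort by the first objective (travel time).
    return sorted(set(front))


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point `ref`."""
    # Keep only front points that are strictly better than ref in both objectives.
    front = [p for p in non_dominated(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_risk = ref[1]
    # With time ascending, risk is strictly descending along the front,
    # so each point contributes one non-overlapping rectangle.
    for time, risk in front:
        area += (ref[0] - time) * (prev_risk - risk)
        prev_risk = risk
    return area


if __name__ == "__main__":
    # Illustrative solutions only; (3.0, 3.0) is dominated by (2.0, 2.0).
    solutions = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
    front = non_dominated(solutions)
    print(len(front), hypervolume_2d(solutions, ref=(5.0, 5.0)))
```

Under this convention, a larger non-dominated point count and a larger hypervolume both indicate a solution set that covers the trade-off between travel time and risk more fully. The hypervolume value depends on the chosen reference point, so comparisons are only meaningful when the same reference point is used for every method.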