AUV path planning based on an improved Q-learning algorithm
A lightweight improved Q-learning algorithm is proposed for the global path planning problem of underactuated AUVs. A distance reward function is designed to accelerate learning and improve algorithm stability. Combining the epsilon-greedy strategy with the Softmax strategy provides a mechanism to balance exploration and exploitation. The algorithm simplifies the action set based on AUV motion constraints to reduce computation time. Simulation results demonstrate that the proposed algorithm solves the AUV path planning problem efficiently, with improved stability and applicability. Compared with the traditional Q-learning algorithm, on short-distance tasks the learning efficiency is increased by 90%, the path length is reduced by 7.85%, and the number of turns is reduced by 14.29%. On long-distance tasks, the learning efficiency is improved by 67.5%, the path length is reduced by 6.10%, and the number of turns is reduced by 32.14%.
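
To make these ingredients concrete, the following Python fragment is a minimal sketch of how a distance reward, a mixed epsilon-greedy/Softmax action selection, and the standard tabular Q-learning update could fit together. The abstract does not give the exact reward formula or how the two strategies are combined, so everything specific here is an assumption for illustration: the reward is taken as the reduction in Euclidean distance to the goal, exploration steps are sampled from a Softmax over Q-values, and all names and parameters (`distance_reward`, `select_action`, `scale`, `temperature`, and so on) are hypothetical. The simplified action set is not modeled.

```python
import math
import random


def distance_reward(pos, next_pos, goal, scale=1.0):
    """Assumed reward shape: positive when a move brings the vehicle closer to the goal."""
    return scale * (math.dist(pos, goal) - math.dist(next_pos, goal))


def softmax_probs(q_values, temperature=1.0):
    """Softmax over Q-values, shifted by the maximum for numerical stability."""
    m = max(q_values)
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, epsilon, temperature=1.0, rng=random):
    """Assumed combination: with probability epsilon, explore by sampling from
    the Softmax distribution; otherwise exploit the greedy action."""
    if rng.random() < epsilon:
        probs = softmax_probs(q_values, temperature)
        return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
    return max(range(len(q_values)), key=q_values.__getitem__)


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

Under this reading, sampling exploratory actions from the Softmax distribution rather than uniformly biases exploration toward actions that already look promising, which is one plausible way the two strategies could jointly balance exploration and exploitation.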