A Study on the Precise Navigation of Live Working Manipulators Based on Double Deep Q-Learning Network
To achieve precise navigation of live working manipulators (robot arms) in the power grid, a global weighted reward mechanism is proposed, and an improved precise navigation model of the manipulator is built on this mechanism together with the double deep Q-network (DDQN) algorithm, in order to address Q-value overestimation and low update efficiency. Obstacle avoidance and navigation of the robot arm during cross-line operation are studied. The results show that the best learning rate is 0.005, and that the global weighted reward mechanism improves the efficiency of Q-value updates more effectively than the immediate reward of the current state. The convergence deviation of the cross-line operation model based on the global weighted reward mechanism and the DDQN algorithm is reduced to ±6.45. The improved DDQN navigation model built on the global weighted reward mechanism shows stronger generalization performance and achieves precise navigation of the robot arm in live working.
live working; manipulator; deep reinforcement learning; double deep Q-learning network; autonomous navigation
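
As an illustration of why a double deep Q-network reduces Q-value overestimation, the following minimal sketch contrasts the standard DQN target with the standard Double DQN target for a single transition. It is not the model described in the abstract: the function names, the discount factor, and the sample Q-values are hypothetical, and the reward passed in is a plain scalar, since the abstract does not define how the global weighted reward is computed.

```python
def dqn_target(reward, next_q_target, gamma=0.9, done=False):
    """Standard DQN target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Standard Double DQN target for one transition.

    The online network selects the next action; the target network evaluates it.
    Decoupling selection from evaluation is what limits overestimation.
    """
    if done:
        return reward
    # Action selection uses the online network's estimates.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's estimate for that action.
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for three actions in the next state.
next_q_online = [1.0, 2.0, 0.5]   # online network prefers action 1
next_q_target = [1.5, 1.0, 3.0]   # target network has a large estimate for action 2

y_dqn = dqn_target(1.0, next_q_target, gamma=0.5)
y_ddqn = double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5)
print(y_dqn, y_ddqn)
```

In this toy case the DQN target takes the maximum of the target network's estimates, while the Double DQN target uses the target network's value for the action chosen by the online network, which gives a smaller target whenever the two networks disagree on the best action.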