Method for Optimizing Parameters of Deep Q Network Based on Evolutionary Algorithms
The study addresses the problems of blind search, unbalanced exploration-exploitation, and slow convergence in the early training stages of the Deep Q Network (DQN). From the perspective of acquiring and exploiting information useful to algorithm training, and taking the Differential Evolution (DE) algorithm as an example, the paper presents DE-DQN, a method that optimizes DQN network parameters with an evolutionary algorithm to accelerate convergence. First, the network parameters of the DQN are encoded as evolutionary individuals. Second, two fitness metrics, "run length" and "average return", are employed separately, and their effectiveness is verified through simulation comparisons on the CartPole control problem. Finally, the experimental results show that after 5000 generations of training, the proposed algorithm improves run length, average return, and cumulative return by 82.7%, 18.1%, and 25.1%, respectively, when "run length" is used as the fitness function, and by 74.9%, 18.5%, and 13.3%, respectively, when "average return" is used, outperforming the improved DQN algorithms. It is concluded that, compared with the traditional DQN and its improved variants, the DE-DQN algorithm acquires more useful information in the early stages and therefore converges faster.
deep reinforcement learning; deep Q network; convergence acceleration; evolutionary algorithms; automatic control
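The abstract describes encoding DQN network parameters as evolutionary individuals and evaluating them with "run length" or "average return" fitness on CartPole. The following is a minimal sketch of that idea only (Differential Evolution over a flat vector of Q-network weights, with both fitness metrics); it is not the paper's implementation, and it omits the combination with gradient-based DQN training. It assumes numpy and gymnasium are available; the network size, population size, and DE hyperparameters (F, CR) are illustrative assumptions.

```python
# Hypothetical sketch: DE over flattened Q-network weights on CartPole (not the paper's code).
import numpy as np
import gymnasium as gym

OBS_DIM, N_ACTIONS, HIDDEN = 4, 2, 16          # CartPole-v1 dimensions; hidden size is an assumption
N_PARAMS = OBS_DIM * HIDDEN + HIDDEN + HIDDEN * N_ACTIONS + N_ACTIONS

def q_values(params, obs):
    """Tiny MLP Q-network whose weights come from one flat DE individual."""
    i = 0
    w1 = params[i:i + OBS_DIM * HIDDEN].reshape(OBS_DIM, HIDDEN); i += OBS_DIM * HIDDEN
    b1 = params[i:i + HIDDEN]; i += HIDDEN
    w2 = params[i:i + HIDDEN * N_ACTIONS].reshape(HIDDEN, N_ACTIONS); i += HIDDEN * N_ACTIONS
    b2 = params[i:i + N_ACTIONS]
    h = np.tanh(obs @ w1 + b1)
    return h @ w2 + b2

def fitness(params, env, episodes=3, metric="run_length"):
    """Fitness of one individual: 'run_length' = steps survived, otherwise mean episode return."""
    scores = []
    for _ in range(episodes):
        obs, _ = env.reset()
        steps, ret, done = 0, 0.0, False
        while not done:
            action = int(np.argmax(q_values(params, obs)))
            obs, reward, terminated, truncated, _ = env.step(action)
            steps += 1
            ret += reward
            done = terminated or truncated
        scores.append(steps if metric == "run_length" else ret)
    return float(np.mean(scores))

def de_sketch(pop_size=20, generations=50, F=0.5, CR=0.9, metric="run_length", seed=0):
    rng = np.random.default_rng(seed)
    env = gym.make("CartPole-v1")
    pop = rng.normal(0.0, 0.5, size=(pop_size, N_PARAMS))   # encode network parameters as individuals
    fit = np.array([fitness(ind, env, metric=metric) for ind in pop])
    for _ in range(generations):
        for i in range(pop_size):
            idx = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(idx, 3, replace=False)]
            mutant = a + F * (b - c)                         # DE/rand/1 mutation
            cross = rng.random(N_PARAMS) < CR
            cross[rng.integers(N_PARAMS)] = True             # guarantee at least one mutated gene
            trial = np.where(cross, mutant, pop[i])
            f_trial = fitness(trial, env, metric=metric)
            if f_trial >= fit[i]:                            # greedy one-to-one selection
                pop[i], fit[i] = trial, f_trial
    return pop[np.argmax(fit)], float(fit.max())

if __name__ == "__main__":
    best_params, best_fit = de_sketch(metric="run_length")
    print("best fitness (run length):", best_fit)
```

Switching `metric` to "average_return" (any value other than "run_length" in this sketch) reproduces the second fitness variant; in the paper the two metrics are compared under the same training budget.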