To address the path planning problem for unmanned aerial vehicle (UAV) formations in unknown dynamic environments, an intelligent decision scheme is proposed based on the multi-agent twin delayed deep deterministic policy gradient algorithm incorporating a dynamic formation reward function (MATD3-IDFRF). First, the sparse reward function is extended for the obstacle-free environment. Then, the dynamic formation problem, which is the focus of UAV formation path planning, is analyzed in depth. It is described as a UAV formation that flies in a stable formation structure while fine-tuning that structure in time according to the surrounding environment. In essence, the spacing between every pair of UAVs remains relatively stable while being fine-tuned in response to the external environment. A reward function based on the optimal distance and the current distance between each pair of UAVs is designed, yielding a dynamic formation reward function, which is then combined with the multi-agent twin delayed deep deterministic policy gradient (MATD3) algorithm to obtain the MATD3-IDFRF algorithm. Finally, comparison experiments show that, in the complex obstacle environment, the dynamic formation reward function presented in this paper improves the algorithm's success rate by 6.8%, improves the converged average reward by 2.3%, and reduces the formation deformation rate by 97%.
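
The idea of a reward built from the optimal and current distance between each pair of UAVs can be illustrated with a minimal sketch. The abstract does not give the exact functional form, so everything below is an assumption made for illustration only: the function name `formation_reward`, the absolute-deviation penalty, and the `tolerance` and `scale` parameters (the tolerance band stands in for the "fine-tuning" allowed by the environment).

```python
import itertools
import math


def formation_reward(positions, optimal_dist, tolerance=0.0, scale=1.0):
    """Hypothetical pairwise-spacing reward (not the paper's exact formula).

    positions    : list of (x, y) tuples, one per UAV
    optimal_dist : dict mapping an index pair (i, j), i < j, to the desired spacing
    tolerance    : assumed slack within which a spacing deviation is not penalized
    scale        : assumed weight applied to the total penalty
    """
    penalty = 0.0
    # Visit every unordered pair of UAVs exactly once.
    for i, j in itertools.combinations(range(len(positions)), 2):
        current = math.dist(positions[i], positions[j])
        deviation = abs(current - optimal_dist[(i, j)])
        # Only the part of the deviation outside the tolerance band is penalized.
        penalty += max(0.0, deviation - tolerance)
    # The reward is non-positive and equals zero when the formation is kept.
    return -scale * penalty


# Three UAVs whose desired spacings form a 3-4-5 triangle.
optimal = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}
in_formation = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
deformed = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]

reward_ok = formation_reward(in_formation, optimal)
reward_bad = formation_reward(deformed, optimal)
reward_slack = formation_reward(deformed, optimal, tolerance=1.0)
```

In this sketch the reward is zero while all pairwise spacings match their targets and becomes more negative as the formation deforms. A nonzero `tolerance` lets the formation stretch or compress slightly, for instance near obstacles, without being penalized. In a multi-agent setting such a term would typically be added to each agent's task reward, but how the paper weights or combines it is not stated here.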