hierarchical reinforcement learning; fixed offensive strategy; option architecture; deterministic gradient policy
National Natural Science Foundation of China; National Key Research and Development Program; Shanghai Commercial Aircraft System Engineering Joint Research Fund
61673265; 2020YFC1512203; CASEF-2022-Z05
2024