Data on Computational Intelligence Detailed by Researchers at National University of Defense Technology (Multi-ship Dynamic Weapon-target Assignment Via Cooperative Distributional Reinforcement Learning With Dynamic Reward)

By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News – Researchers detail new data in Machine Learning - Computational Intelligence. According to news originating from Changsha, People’s Republic of China, by NewsRx correspondents, research stated, “In fleet air defense, the efficient coordination of multiple ships to complete weapon-target assignment has always been a critical challenge, primarily due to the varying combat capabilities and duties associated with each ship. Consequently, the traditional ‘weapon-target’ assignment mode has turned into a ‘ship-weapon-target’ assignment mode in the multi-ship dynamic weapon-target assignment (MS-DWTA) problem we proposed, with a larger solution space.”

Our news journalists obtained a quote from the research from the National University of Defense Technology, “In this problem, different ships possess distinct attributes, such as defense duties, weapon types, and loaded missile quantities. To solve this problem, we proposed an Attention enhanced multiagent Distributional reinforcement learning method with Dynamic Reward (ADDR). Different from standard reinforcement learning methods, ADDR learns to estimate the distribution, as opposed to only the expectation of future return, enabling better adaptation to air defense scenarios with significant randomness. The multi-head attention network integrates both the ship situation and the target situation to appropriately adjust the output of each agent, which explicitly considers the agent-level impact of ships to the whole fleet. Moreover, due to the missile flight time, ships may not immediately receive rewards after executing actions. To address this delayed phenomenon, we designed a dynamic reward mechanism to accurately adjust the delayed rewards.”
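To illustrate the difference between estimating the distribution of future return and estimating only its expectation, here is a minimal Python sketch. It is not the ADDR implementation: the quoted abstract does not say how ADDR represents the return distribution, so the quantile-regression update used here is an assumption, as are the function names and the toy payoff (+1 for an intercept with probability 0.7, -5 for a leak with probability 0.3).

```python
import random


def quantile_midpoints(n):
    """Quantile levels tau_i = (2i + 1) / (2n), the midpoints of n equal bins."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def update_quantiles(thetas, taus, sample, lr):
    """One stochastic quantile-regression step toward a sampled return.

    Each estimate moves up by lr * tau when the sample is at or above it,
    and down by lr * (1 - tau) when the sample is below it.
    """
    return [
        theta + lr * (tau - (1.0 if sample < theta else 0.0))
        for theta, tau in zip(thetas, taus)
    ]


def sample_return(rng):
    """Hypothetical engagement outcome (assumed payoffs, not from the source)."""
    return 1.0 if rng.random() < 0.7 else -5.0


def estimate(n_quantiles=10, steps=20000, lr=0.01, seed=0):
    """Learn n_quantiles quantile estimates of the toy return distribution."""
    rng = random.Random(seed)
    taus = quantile_midpoints(n_quantiles)
    thetas = [0.0] * n_quantiles
    for _ in range(steps):
        thetas = update_quantiles(thetas, taus, sample_return(rng), lr)
    return thetas


thetas = estimate()
# The expectation is recoverable from the quantiles, but not the reverse.
mean_estimate = sum(thetas) / len(thetas)
print("quantile estimates:", [round(t, 2) for t in thetas])
print("mean of quantiles:", round(mean_estimate, 2))
```

In this toy setting the lower quantile estimates settle near the leak payoff and the upper ones near the intercept payoff, so the learner keeps the information that a rare but costly outcome exists. A single expected value near -0.8 would hide that.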

Changsha, People’s Republic of China, Asia, Computational Intelligence, Emerging Technologies, Machine Learning, Reinforcement Learning, National University of Defense Technology

2024

Robotics & Machine Learning Daily News

Year, Volume (Issue): 2024 (Oct. 17)