Multi-agent Reinforcement Learning Algorithm Based on State Space Exploration in Sparse Reward Scenarios
In multi-agent task scenarios, a large and diverse state space is often encountered. In some cases, the reward information provided by the external environment is extremely limited, exhibiting sparse-reward characteristics. Most existing multi-agent reinforcement learning algorithms are of limited effectiveness in such sparse-reward scenarios, because relying only on accidentally discovered reward sequences makes the learning process slow and inefficient. To address this issue, a multi-agent reinforcement learning algorithm based on state space exploration (MASSE) in sparse reward scenarios is proposed. MASSE constructs a subset space of states, maps one state from this subset, and takes it as an intrinsic goal, enabling agents to use the state space more fully and to reduce unnecessary exploration. Agent states are decomposed into self-states and environmental states, and intrinsic rewards based on mutual information are generated by combining these two types of states with the intrinsic goals. By constructing the state subset space and generating mutual-information-based intrinsic rewards, states close to the target states, as well as states that improve the agents' understanding of the environment, are rewarded appropriately. Consequently, agents are motivated to move toward the goal more actively while deepening their understanding of the environment, which guides them to adapt flexibly to sparse-reward scenarios. Experimental results indicate that MASSE achieves superior performance in multi-agent collaborative scenarios with varying degrees of reward sparsity.
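To make the described mechanism concrete, the following minimal Python sketch illustrates the pipeline outlined in the abstract: sampling an intrinsic goal from a constructed state subset, decomposing a state into self and environmental parts, and combining goal proximity with a mutual-information bonus into an intrinsic reward. All names (sample_intrinsic_goal, split_state, empirical_mi, intrinsic_reward), the uniform goal-sampling rule, the plug-in MI estimator, and the coefficient beta are illustrative assumptions, not the paper's actual formulation.

```python
import math
from collections import Counter

import numpy as np


def sample_intrinsic_goal(state_subset, rng):
    # Map one state out of the constructed state-subset space and use it as
    # the intrinsic goal (uniform sampling is an assumed selection rule).
    return state_subset[rng.integers(len(state_subset))]


def split_state(state, self_dim):
    # Decompose an agent state into a self-state and an environmental state.
    return state[:self_dim], state[self_dim:]


def empirical_mi(pairs):
    # Plug-in estimate of mutual information I(X; Y) from (x, y) samples,
    # where x and y are hashable (e.g., discretized env-states and goal ids).
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def intrinsic_reward(self_state, goal, mi_estimate, beta=0.1):
    # Illustrative intrinsic reward: states close to the intrinsic goal are
    # rewarded, plus an MI bonus for "understanding" the environment.
    # beta is a hypothetical trade-off coefficient, not taken from the paper.
    goal_self = goal[: len(self_state)]
    return -float(np.linalg.norm(self_state - goal_self)) + beta * mi_estimate


# Toy usage: 6-dimensional states, first 3 dimensions treated as the self-state.
rng = np.random.default_rng(0)
subset = rng.normal(size=(32, 6))            # hypothetical state-subset space
goal = sample_intrinsic_goal(subset, rng)
self_s, env_s = split_state(rng.normal(size=6), self_dim=3)
# History of (discretized env-state, goal id) pairs for the MI estimate.
history = [(round(float(x), 1), int(i % 4))
           for i, x in enumerate(rng.normal(size=200))]
r_int = intrinsic_reward(self_s, goal, empirical_mi(history))
```

Under this reading, the goal-proximity term drives agents toward the intrinsic goal while the MI term rewards environment-informative states; how MASSE actually estimates and weights these quantities is specified in the paper itself.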