
An anti-collision algorithm for robotic search-and-rescue tasks in unknown dynamic environments

This paper deals with the search-and-rescue tasks of a mobile robot with multiple targets of interest in an unknown dynamic environment. The problem is challenging because the mobile robot must search for multiple targets while simultaneously avoiding obstacles. To ensure that the mobile robot avoids collisions properly, we propose a mixed-strategy Nash equilibrium based Dyna-Q (MNDQ) algorithm. First, a multi-objective layered structure is introduced to simplify the representation of multiple objectives and reduce computational complexity. This structure divides the overall task into subtasks, including searching for targets and avoiding obstacles. Second, a risk-monitoring mechanism is proposed based on the relative positions of dynamic risks. This mechanism helps the robot avoid potential collisions and unnecessary detours. Third, to improve sampling efficiency, MNDQ is presented, which combines Dyna-Q with mixed-strategy Nash equilibrium. Under a mixed-strategy Nash equilibrium, the agent makes decisions in the form of probabilities so as to maximize the expected reward, which improves the overall performance of the Dyna-Q algorithm. Finally, a series of simulations is conducted to verify the effectiveness of the proposed method. The results show that MNDQ performs well and exhibits robustness, providing a competitive solution for future autonomous robot navigation tasks.
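The abstract names two standard building blocks, Dyna-Q and mixed-strategy Nash equilibrium, without giving details of how MNDQ combines them. The sketch below illustrates only the textbook versions of each piece and is not the authors' method: `solve_2x2_zero_sum` computes the equilibrium of a 2x2 zero-sum matrix game in closed form, and `dyna_q` is plain tabular Dyna-Q with epsilon-greedy action selection. The corridor environment, the function names, and all hyperparameters are assumptions made for illustration.

```python
import random
from collections import defaultdict


def solve_2x2_zero_sum(payoff):
    """Equilibrium of a 2x2 zero-sum game (row player's payoffs).

    Returns (p, q, value): p is the probability that the row player
    picks row 0, q the probability that the column player picks column 0.
    """
    (a, b), (c, d) = payoff
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    if max(row_mins) == min(col_maxs):
        # Saddle point: a pure-strategy equilibrium exists.
        p = 1.0 if row_mins[0] >= row_mins[1] else 0.0
        q = 1.0 if col_maxs[0] <= col_maxs[1] else 0.0
        return p, q, float(max(row_mins))
    # No saddle point: each player mixes so the opponent is indifferent.
    denom = a - b - c + d
    return (d - c) / denom, (d - b) / denom, (a * d - b * c) / denom


def corridor_step(state, action):
    """Hypothetical toy environment: cells 0..4, goal at cell 4."""
    next_state = min(max(state + action, 0), 4)
    done = next_state == 4
    reward = 1.0 if done else -0.1  # small step cost, bonus at the goal
    return next_state, reward, done


def dyna_q(step, start, actions, episodes=100, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, max_steps=200, seed=0):
    """Plain tabular Dyna-Q for a deterministic environment."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-values keyed by (state, action)
    model = {}              # learned model: (state, action) -> outcome

    def update(s, a, r, s2, done):
        best_next = 0.0 if done else max(q[(s2, x)] for x in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda x: q[(state, x)])
            next_state, reward, done = step(state, action)
            update(state, action, reward, next_state, done)  # real experience
            model[(state, action)] = (next_state, reward, done)
            for _ in range(planning_steps):
                # Planning: replay transitions sampled from the learned model.
                (s, a), (s2, r, d) = rng.choice(list(model.items()))
                update(s, a, r, s2, d)
            if done:
                break
            state = next_state
    return q
```

For example, `solve_2x2_zero_sum([[1, -1], [-1, 1]])` (matching pennies) yields equal probabilities for both players, and `dyna_q(corridor_step, start=0, actions=(-1, 1))` returns a Q-table from which a greedy policy can be read off. The planning loop is what gives Dyna-Q its sample efficiency: each real transition is reused many times through the learned model.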

Search and rescue; Reinforcement learning; Game theory; Collision avoidance; Decision-making

Yang Chen (陈洋), Dianxi Shi (史殿习), Huanhuan Yang (杨焕焕), Tongyue Li (李彤月), Zhen Wang (王震)


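School of Computer Science, Peking University, Beijing 100871, China

Tianjin (Binhai) Artificial Intelligence Innovation Center, Tianjin 300457, China

Intelligent Game and Decision Laboratory, Beijing 100071, China

College of Computer, National University of Defense Technology, Changsha 410073, China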



National Natural Science Foundation of China (No. 91948303)

Frontiers of Information Technology & Electronic Engineering
Zhejiang University

Indexed in: CSTPCD
Impact factor: 0.371
ISSN: 2095-9184
Year, volume (issue): 2024, 25(4)