Task Planning Method based on Reinforcement Learning for Asteroid Multi-node Lander

Abstract: The multi-node small-body lander proposed in China has strongly coupled node motions, which makes landing planning complex, execution prone to deviation, and replanning slow. To address these problems, a reinforcement learning task planning method for the multi-node lander is proposed. The method first describes the state-space planning logic model as a matrix and uses this model matrix to formulate task planning as a Markov process. A breadth-first search with state hashing then filters randomly generated states to produce valid initial and terminal states of the lander's state space, from which a virtual state-space training environment is built. The reinforcement learning agent is trained with randomly switched initial and terminal states, which improves its planning adaptability and enables task planning under a variety of conditions. A landing task (landing-site detection and analysis during the landing process) is simulated. The experiments show that the trained agent completes all test tasks and plans more efficiently than the POPF3 planner, with a larger speed advantage on short-sequence tasks whose plans are adjusted frequently, so the method is well suited to task planning for multi-node landers.
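The state-filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the precondition matrix `PRE`, the effect matrix `EFF`, the three boolean state variables, and the helper names `applicable`, `apply_action`, `reachable`, and `sample_valid_pairs` are all hypothetical and chosen only for illustration. The sketch assumes that an action model can be encoded as matrices, runs a breadth-first search that uses a hash set of visited states, and keeps only those randomly generated (initial, terminal) pairs for which the terminal state is reachable.

```python
import random
from collections import deque

# Hypothetical toy model (an assumption, not taken from the paper):
# 3 boolean state variables and 3 actions, one matrix row per action.
# PRE[a][i]: 1 = variable i must be True, 0 = must be False, -1 = don't care.
# EFF[a][i]: 1 = set variable i to True, 0 = set to False, -1 = unchanged.
PRE = [
    [0, -1, -1],
    [1, 0, -1],
    [1, 1, 0],
]
EFF = [
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]


def applicable(state, a):
    """Check the preconditions of action a against a state tuple."""
    return all(p == -1 or bool(p) == s for p, s in zip(PRE[a], state))


def apply_action(state, a):
    """Return the successor state after applying the effects of action a."""
    return tuple(s if e == -1 else bool(e) for s, e in zip(state, EFF[a]))


def reachable(start, goal):
    """Breadth-first search; visited states are kept in a hash set."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        for a in range(len(PRE)):
            if applicable(state, a):
                nxt = apply_action(state, a)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False


def sample_valid_pairs(n_pairs, n_vars=3, seed=0, max_tries=10_000):
    """Draw random (initial, terminal) pairs and keep the solvable ones."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(max_tries):
        if len(pairs) == n_pairs:
            break
        start = tuple(rng.random() < 0.5 for _ in range(n_vars))
        goal = tuple(rng.random() < 0.5 for _ in range(n_vars))
        if start != goal and reachable(start, goal):
            pairs.append((start, goal))
    return pairs


if __name__ == "__main__":
    for start, goal in sample_valid_pairs(3):
        print(start, "->", goal)
```

In a training loop, each episode could draw one of the retained pairs as its initial and terminal state, which corresponds to the random switching of start and end states that the abstract mentions. How the paper actually encodes states and rewards is not given in this record.
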

Keywords:

Task planning; Reinforcement learning; Multi-node lander; State space

Authors:

Lu Siyao (路思遥), Xu Rui (徐瑞), Gao Ai (高艾), Li Zhaoyu (李朝玉), Wang Bang (王棒), Zhu Shengying (朱圣英), Li Chaobo (李超博)

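Affiliations:

School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China

Key Laboratory of Autonomous Navigation and Control for Deep Space Exploration, Ministry of Industry and Information Technology (Beijing Institute of Technology), Beijing 100081, China

Shanghai Institute of Satellite Engineering, Shanghai 201109, China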

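Funding:

National Key Research and Development Program of China; Space Debris Special Project; National Natural Science Foundation of China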

Project numbers:

2019YFA0706500; KJSP2020020302; 62006019

Publication year:

2024

DOI:

10.3873/j.issn.1000-1328.2024.06.003

Journal: Journal of Astronautics (宇航学报)

Publisher: Chinese Society of Astronautics (中国宇航学会)

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.887

ISSN: 1000-1328

Year, volume (issue): 2024, 45(6)

Number of references: 4