Control and Decision (控制与决策), 2024, Vol. 39, Issue 3: 777-785. DOI: 10.13195/j.kzyjc.2022.0812

Multi-modal feature robotic arm grasping pose detection with attention mechanism

Chu Hongyu (楚红雨)¹, Leng Qiqi (冷齐齐)¹, Zhang Xiaoqiang (张晓强)¹, Chang Zhiyuan (常志远)¹, Shao Yanhua (邵延华)¹

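Author information

  • 1. School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China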

Abstract

To address the low accuracy and long detection time of grasp pose detection for unknown objects in robotic arm grasping tasks, a multi-modal feature grasp pose detection network with an attention mechanism is proposed. First, a multi-modal feature fusion module is designed that fuses the multi-modal features and strengthens them through weighting. Then, because a relatively shallow residual network is weak at extracting key features, a convolutional attention module is introduced to further improve the network's feature extraction ability. Finally, the extracted features are regressed directly through a fully connected layer to obtain the optimal grasp pose. Experimental results on the public Cornell grasping dataset show detection accuracies of 98.9% on the image-wise split and 98.7% on the object-wise split, at a detection speed of 51 FPS. In 100 real-world grasping trials on 10 categories of objects, the success rate is 95%.
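The fusion-then-attention idea in the abstract can be illustrated with a minimal sketch. It is not the paper's network: it assumes the two modalities are RGB and depth feature maps (the abstract does not name them), fuses them by plain channel concatenation, and applies only the channel-attention half of a CBAM-style module, with a hand-set two-layer MLP instead of learned weights. All function and parameter names (`channel_weights`, `fuse_and_reweight`, `w1`, `w2`) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_weights(feature_maps, w1, w2):
    """CBAM-style channel attention: one weight in (0, 1) per channel.

    feature_maps: list of C channels, each an H x W list of floats.
    w1: hidden x C matrix, w2: C x hidden matrix (shared two-layer MLP).
    """
    def mlp(v):
        # Hidden layer with ReLU, then a linear output layer.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    flat = [[x for row in ch for x in row] for ch in feature_maps]
    avg_desc = [sum(ch) / len(ch) for ch in flat]  # global average pooling
    max_desc = [max(ch) for ch in flat]            # global max pooling
    # Both descriptors pass through the same MLP; the outputs are summed.
    return [sigmoid(a + m) for a, m in zip(mlp(avg_desc), mlp(max_desc))]


def fuse_and_reweight(rgb_maps, depth_maps, w1, w2):
    """Concatenate two modalities along channels, then scale each channel."""
    fused = rgb_maps + depth_maps  # assumption: fusion by concatenation
    weights = channel_weights(fused, w1, w2)
    scaled = [[[x * w for x in row] for row in ch]
              for ch, w in zip(fused, weights)]
    return scaled, weights


# Toy input: one 2x2 "RGB" channel and one 2x2 "depth" channel.
rgb = [[[1.0, 2.0], [3.0, 4.0]]]
depth = [[[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, 0.0]]        # hidden size 1, 2 input channels (hand-set values)
w2 = [[1.0], [-1.0]]     # 2 output channels
scaled, weights = fuse_and_reweight(rgb, depth, w1, w2)
```

In a trained network the MLP weights would be learned, and a full CBAM block would follow the channel step with a spatial-attention step, which is omitted here.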

Key words

target grasping / pose detection / robotic arm / attention mechanism / multi-modal features / deep learning

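Funding

State Administration of Science, Technology and Industry for National Defense project ([2019]1276)

National Natural Science Foundation of China (12175187)

Doctoral Fund of Southwest University of Science and Technology (19zx7123)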

Publication year

2024
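Control and Decision (控制与决策)
Northeastern University (东北大学)

Indexed in: CSTPCD, Peking University Core Journals (北大核心)
Impact factor: 1.227
ISSN: 1001-0920
References: 24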