Intensive grasp pose detection based on cascaded fusion network
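Deng Peng (邓鹏) 1, Tang Wentao (唐文涛) 1, Huang Kaiming (黄开明) 2
Author information
- 1. School of Electronic Information Engineering, Jingchu University of Technology, Jingmen 448000
- 2. School of Electronic Information, Wuhan University, Wuhan 430072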
Abstract
To address the low detection accuracy and long processing time of robotic arms performing planar grasping of unknown objects in dense scenes, a new grasping network structure, called the Cascade Fusion Grasp Detection (CFGD) network, is proposed to predict the grasp poses of objects in dense scenes. On top of the proposed backbone and several blocks that extract initial high-resolution features, several cascade stages are introduced to generate multi-scale features in the CFGD network. Each stage comprises a sub-backbone for feature extraction and an extremely lightweight transition block for feature integration. This design allocates a large proportion of the parameters to the backbone and makes feature fusion deeper and more effective. Compared with existing algorithms, the proposed algorithm markedly improves detection accuracy and speed on the Cornell grasp dataset, the Jacquard grasp dataset, and a custom dataset. In real grasping experiments, the grasp success rate is 98.6% in single-object scenes and 94.6% in dense scenes. The experimental results show that the proposed algorithm predicts and grasps unknown objects in dense scenes with high accuracy.
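The stage structure described in the abstract (a sub-backbone that extracts multi-scale features, followed by a lightweight transition block that integrates them) can be illustrated with a minimal NumPy sketch. This is not the authors' CFGD implementation: the function names, the use of average pooling and `tanh` in place of learned convolutions, the additive top-down fusion, and the number of stages and scales are all assumptions made only to show the data flow.

```python
import numpy as np

def avg_pool2(x):
    """Halve the spatial resolution of a (C, H, W) map with 2x2 average pooling."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    """Double the spatial resolution by nearest-neighbour repetition."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def sub_backbone(x, num_scales=3):
    """Hypothetical stand-in for a sub-backbone.

    Returns feature maps at progressively coarser resolutions (fine to coarse).
    A real sub-backbone would use learned convolutional blocks instead.
    """
    feats = []
    for _ in range(num_scales):
        x = np.tanh(avg_pool2(x))
        feats.append(x)
    return feats

def transition_block(feats):
    """Hypothetical lightweight fusion: fold coarse maps back into the finest one."""
    fused = feats[-1]
    for f in reversed(feats[:-1]):
        fused = f + upsample2(fused)
    return fused  # same shape as feats[0]

def cascade(x, num_stages=3):
    """Run several cascade stages on an initial high-resolution feature map."""
    feats = []
    for _ in range(num_stages):
        feats = sub_backbone(x)                      # multi-scale feature extraction
        x = x + upsample2(transition_block(feats))   # integrate into high-res features
    return x, feats

# Example: 8 channels at 32x32 (height and width must be divisible by 8 here).
high_res = np.random.default_rng(0).standard_normal((8, 32, 32))
fused, multi_scale = cascade(high_res)
print(fused.shape, [f.shape for f in multi_scale])
```

In this sketch the transition block has no parameters at all, which is the extreme case of "lightweight"; in a trained network almost all parameters would then sit in the sub-backbones, in line with the abstract's remark about the backbone's parameter share.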
Key words
mechanical arm/plane grasping/pose estimation/cascade fusion/machine vision
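Funding
National Natural Science Foundation of China (42174189)
Outstanding Young and Middle-aged Science and Technology Innovation Team Project of Hubei Provincial Higher Education Institutions (T2021028)
Jingmen Science and Technology Plan Project (2023YFYB040)
Publication year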
2024