
A Visual-Tactile Fusion Method for Estimating the Grasping Force on Flexible Objects

To address the problem of manipulating flexible objects, a visual-tactile fusion method for estimating the grasping force on flexible objects is proposed, named the MultiSense Local-Enhanced Transformer (MSLET). This approach uses a model to learn low-dimensional features from each sensor modality, infers the physical characteristics of the object to be grasped, and fuses the modality-specific physical feature vectors to predict the grasping result; by leveraging prior experience of safe grasping, the optimal grasping force is then inferred. First, a Feature-to-Patch module is proposed to extract shallow features from both visual and tactile images. It generates image patches from these shallow features and derives their edge characteristics, thereby fully learning the feature information in the data and better inferring the physical properties of the object. Second, a Local-Enhanced module is proposed to strengthen local features: depth-wise separable convolution is applied to the image patches produced by the multi-head self-attention mechanism, enhancing locality, promoting correlation between spatially adjacent tokens, and improving the prediction accuracy of grasping results. Finally, comparative experiments show that the proposed algorithm improves grasping accuracy by 10.19% over the state-of-the-art model while maintaining operational efficiency, demonstrating that it can effectively estimate the grasping force on flexible objects.
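The two modules described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the paper's implementation: function names, shapes, and the single-channel input are illustrative only. `feature_to_patch` extracts shallow edge features with a Sobel-style filter and splits them into flattened patch tokens; `depthwise_separable_conv` reshapes a token sequence back onto its spatial grid and applies a per-channel (depthwise) 3x3 convolution followed by a 1x1 pointwise channel mix, the operation the Local-Enhanced module uses to couple spatially adjacent tokens.

```python
import numpy as np

def feature_to_patch(img, patch=4):
    """Toy stand-in for the Feature-to-Patch idea: extract shallow edge
    features with a Sobel-style filter, then split the edge map into
    flattened, non-overlapping patch tokens."""
    kx = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
    H, W = img.shape
    padded = np.pad(img, 1)                       # zero-pad the borders
    edges = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            edges[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kx)
    p = patch
    # (H, W) -> (H/p, p, W/p, p) -> (H/p, W/p, p, p) -> (num_patches, p*p)
    return (edges.reshape(H // p, p, W // p, p)
                 .transpose(0, 2, 1, 3)
                 .reshape(-1, p * p))

def depthwise_separable_conv(tokens, dw_kernels, pw_weights, grid=(4, 4)):
    """Toy stand-in for the Local-Enhanced idea: reshape the token
    sequence (N, C) back onto its (H, W) grid, apply a per-channel
    (depthwise) 3x3 convolution, then a 1x1 pointwise channel mix."""
    H, W = grid
    N, C = tokens.shape
    assert N == H * W, "token count must match the spatial grid"
    x = tokens.reshape(H, W, C)
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # pad spatial dims only
    dw = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            neigh = padded[i:i + 3, j:j + 3, :]   # 3x3 spatial neighbourhood
            dw[i, j] = np.einsum('hwc,hwc->c', neigh, dw_kernels)
    return dw.reshape(N, C) @ pw_weights          # pointwise (1x1) conv
```

With a depthwise kernel that keeps only the centre tap and identity pointwise weights, the operation reduces to the identity on the tokens, which makes the spatial bookkeeping easy to sanity-check.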

visual-tactile fusion; grasping force estimation; physical feature embedding

WU Peiliang, LI Yao, NIU Mingyue, CHEN Wenbai, GAO Guowei


School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, Hebei, China

Hebei Provincial Key Laboratory of Computer Virtual Technology and System Integration, Qinhuangdao 066004, Hebei, China

School of Automation, Beijing Information Science and Technology University, Beijing 100192, China


2024

Robot (机器人)
Chinese Association of Automation; Shenyang Institute of Automation, Chinese Academy of Sciences


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.134
ISSN:1002-0446
Year, Volume (Issue): 2024, 46(5)