Lightweight Template Matching Algorithm Based on Rendering Perspective Sampling

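Abstract (translated from Chinese): Pose estimation is a classical computer vision perception task commonly used in scenarios such as autonomous driving and robotic grasping. Deep-learning-based template matching algorithms for pose estimation are highly robust to unseen scenes, but current methods generally suffer from high GPU memory consumption and slow running speed. To address this, a lightweight deep-learning template matching algorithm is proposed. The method introduces depthwise separable convolution and an attention mechanism to extract more generalizable image features while greatly reducing the number of model parameters, which improves pose estimation accuracy for unseen and occluded objects. In addition, an iterative rendering-viewpoint sampling optimizer is proposed that refines the initial estimate by adding only a small number of rendered templates, greatly increasing running speed while preserving matching accuracy. Experimental results on open-source datasets show that the parameter count of the proposed lightweight model is only 0.179% of that of an advanced template matching model, and that, without requiring high-quality rendered templates, the average matching accuracy improves by 3.834%.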

English abstract: As a classical computer vision perception task, pose estimation is commonly used in scenarios such as autonomous driving and robot grasping. Pose estimation algorithms based on template matching are advantageous in detecting new objects. However, current state-of-the-art template matching methods based on convolutional neural networks generally suffer from large memory consumption and slow speed. To solve these problems, this paper proposes a deep learning-based lightweight template matching algorithm. The method, which incorporates depthwise convolution and the attention mechanism, drastically reduces the number of model parameters and can extract more generalized image features. Thus, the accuracy of pose estimation for unseen and occluded objects is improved. In addition, this paper proposes an iterative rendering perspective sampling strategy to significantly reduce the number of templates. Experiments on open-source datasets show that the proposed lightweight model uses only 0.179% of the parameter count of the commonly used template matching model, while improving the average pose estimation accuracy by 3.834%.
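The parameter reduction attributed to depthwise (separable) convolution in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' network: the layer sizes (`c_in`, `c_out`, `k`) are hypothetical, biases are ignored, and it only compares one standard convolution layer with a depthwise convolution followed by a 1×1 pointwise convolution. It does not attempt to reproduce the 0.179% whole-model figure reported in the abstract.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 256, 256, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, depthwise separable: {sep}, ratio: {ratio:.4f}")
```

For this hypothetical layer the separable variant needs roughly 11.5% of the weights of the standard convolution; in general the ratio is 1/c_out + 1/k², so the saving grows with kernel size and output channel count.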

English keywords:

template matching; pose estimation; lightweight model; convolutional neural network

Authors:

Wen Daizhou (文代洲), Wang Xi (王晰), Ren Mingjun (任明俊)

Affiliation:

School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

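Chinese keywords:

模板匹配 (template matching); 位姿估计 (pose estimation); 轻量化模型 (lightweight model); 卷积神经网络 (convolutional neural network)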

Funding:

General Program of the National Natural Science Foundation of China

Grant number:

52175477

Publication year:

2024

DOI:

10.3788/LOP240469

Laser & Optoelectronics Progress (激光与光电子学进展)

Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.153

ISSN: 1006-4125

Year, volume (issue): 2024, 61(18)

Number of references: 6