Research on a Monocular Pose Estimation Algorithm Based on Improved YOLO6D

Pan Jiang (潘江)¹, Ren Dejun (任德均)¹, Shi Yuhang (史雨杭)¹, Wang Linnan (王淋楠)¹

Author information

1. School of Mechanical Engineering, Sichuan University, Chengdu, Sichuan 610065, China
Abstract

To address the low accuracy, weak real-time performance, and model complexity of current monocular-RGB 6D pose estimation algorithms for low-texture objects in cluttered scenes, an object pose estimation algorithm based on an improved YOLO6D is proposed. The original backbone network, DarkNet-19, is replaced with ConvNeXt, a pure convolutional neural network (CNN). The network output is processed by spatial pyramid pooling (SPP), up-sampled, and concatenated with low-level feature maps for feature fusion, which improves the network's feature extraction and multi-scale capabilities. The loss function is improved based on Focal Loss to strengthen the network's learning ability. Using the prior size information and geometric features of the object, additional 2D-3D point pairs are derived to improve the solution accuracy of the perspective-n-point (PnP) algorithm. Experiments on the LINEMOD dataset show that, with the 5-pixel 2D reprojection threshold as the metric, the proposed algorithm achieves an average accuracy of 95.60% on 12 test objects, 8.14 percentage points higher than the original algorithm, with a run time of about 60 ms, a significant improvement in performance.
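The 5-pixel 2D reprojection criterion named in the abstract can be illustrated with a minimal sketch. It follows the common definition of the metric (mean pixel distance between model points projected with the ground-truth pose and with the estimated pose) and assumes an undistorted pinhole camera with zero skew. The function names, the `(R, t)` pose representation, and the intrinsic matrix `K` are illustrative assumptions and are not taken from the paper.

```python
import math


def project(K, R, t, point):
    """Project one 3D model point to pixel coordinates (pinhole model, no distortion)."""
    # Transform the point into the camera frame: X_c = R * X + t
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    # Apply the intrinsics (fx, fy, cx, cy); skew is assumed to be zero
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return u, v


def reprojection_error(K, pose_gt, pose_est, model_points):
    """Mean pixel distance between projections under the two poses.

    Each pose is a hypothetical (R, t) pair: a 3x3 rotation as nested lists
    and a 3-element translation.
    """
    total = 0.0
    for p in model_points:
        u_gt, v_gt = project(K, pose_gt[0], pose_gt[1], p)
        u_est, v_est = project(K, pose_est[0], pose_est[1], p)
        total += math.hypot(u_gt - u_est, v_gt - v_est)
    return total / len(model_points)


def is_correct(K, pose_gt, pose_est, model_points, threshold_px=5.0):
    """A pose counts as correct when the mean reprojection error is below the threshold."""
    return reprojection_error(K, pose_gt, pose_est, model_points) < threshold_px
```

Under this definition, the reported accuracy would be the fraction of test images for which `is_correct` holds, averaged over the test objects.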

Key words

6D pose estimation / monocular vision / ConvNeXt / perspective-n-point (PnP) algorithm

Publication year

2024

Transducer and Microsystem Technologies (传感器与微系统)

The 49th Research Institute of China Electronics Technology Group Corporation

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.61

ISSN: 1000-9787

Number of references: 3