Real-time 3D target pose estimation method based on a multi-branch architecture
Hong Yong (洪勇) 1, Liu Jin (刘进) 2, Luo Shupei (罗书培) 3, Chen Xin (陈新) 3, Li Deren (李德仁) 2, Zhang Qing (张庆) 4
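Author information
- 1. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430070; Mobile Broadcasting and Information Service Industry Innovation Research Institute (Wuhan) Co., Ltd., Ezhou 436031
- 2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430070
- 3. Mobile Broadcasting and Information Service Industry Innovation Research Institute (Wuhan) Co., Ltd., Ezhou 436031
- 4. College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001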
Abstract
In vehicle-road cooperative scenarios, target scales and positions span a wide range of values, which leads to low accuracy in real-time pose estimation and slow convergence of regression models. To address this, a real-time method for estimating the 3D pose of targets based on a multi-branch architecture is proposed. Building on the model architecture of a 2D object detection algorithm, a pose estimation branch is designed to output the target's orientation quaternion, the 3D position of the target relative to the camera, and the target's length, width, and height. In the training stage, dedicated loss functions are designed for the pose estimation branch: a vector normalization (unitization) operator is used to regress the orientation quaternion, and a logarithmic remapping algorithm is used to regress the target dimensions and the target-to-camera distance. In the inference stage, the 3D pose of the target is solved from the quaternion and the target-to-camera distance output by the model, achieving accurate pose estimation. In pose accuracy verification on the OVRC dataset, the maximum mean square error of the position coordinates is 1.94 m, and the maximum mean square error of the attitude angles is 3.98°. In relative accuracy tests on the KITTI dataset, detection accuracy improves by 3.22% compared with the PVNet method, and inference efficiency doubles compared with the 3DBB method.
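The abstract names three operations without giving their formulas: unit normalization of the quaternion, logarithmic remapping of sizes and distances, and recovery of the 3D position from the predicted distance. The Python sketch below illustrates one plausible reading of each and is not the authors' implementation. The function names, the natural-log form of the remapping, the (w, x, y, z) component order, and the pinhole-camera step that places the target along the ray through a pixel (u, v) are all assumptions made for illustration.

```python
import math


def normalize_quaternion(q, eps=1e-12):
    """Scale a raw 4-vector to unit length so it represents a valid rotation.

    Assumes (w, x, y, z) ordering; the source does not state the convention.
    """
    norm = math.sqrt(sum(c * c for c in q))
    if norm < eps:
        # Degenerate network output: fall back to the identity rotation.
        return (1.0, 0.0, 0.0, 0.0)
    return tuple(c / norm for c in q)


def log_remap(value):
    """Encode a positive quantity (a size or distance in metres) in log space.

    Assumed form: plain natural logarithm. It compresses a wide value range,
    which is the stated motivation for the remapping.
    """
    return math.log(value)


def inverse_log_remap(encoded):
    """Decode a log-space prediction back to metres."""
    return math.exp(encoded)


def position_from_distance(u, v, distance, fx, fy, cx, cy):
    """Place the target at `distance` along the camera ray through pixel (u, v).

    Assumes an ideal pinhole camera with intrinsics fx, fy, cx, cy and that
    (u, v) is the projection of the target centre (both hypothetical here).
    """
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    norm = math.sqrt(sum(c * c for c in ray))
    return tuple(distance * c / norm for c in ray)
```

For instance, `normalize_quaternion((2.0, 0.0, 0.0, 0.0))` returns `(1.0, 0.0, 0.0, 0.0)`, and decoding `log_remap(d)` with `inverse_log_remap` recovers `d` up to floating-point error. Regressing in log space means that equal errors in the encoded value correspond to equal relative errors in metres, which is one reason such a remapping can help when distances vary widely.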
Keywords
multi-branch architecture / three-dimensional pose / object detection / pose estimation / orientation quaternion
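Funding
Major Science and Technology Project of Hubei Province (2020AAA004)
Hainan Province Science and Technology Innovation Joint Project (2021CXLH0001)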
Publication year
2024