Improved stereo matching network based on dense multi-scale feature guided cost aggregation

To further improve the disparity prediction accuracy of stereo matching algorithms in ill-posed regions such as repetitive textures, textureless areas, and edges, an improved dense multi-scale feature guided aggregation network (DGNet) based on PSMNet was proposed. First, a dense multi-scale feature extraction module was designed based on the densely connected atrous spatial pyramid pooling structure. This module extracts region-level features at different scales using atrous convolutions with different dilation rates, and effectively integrates image features of different scales through dense connections, enabling the network to capture rich contextual information. Second, an initial cost volume was formed by concatenating the left feature maps with their corresponding right feature maps at each disparity level. A dense multi-scale feature guided cost aggregation structure was then proposed, which adaptively fuses the cost volume with the dense multi-scale features while aggregating the cost volume, so that the subsequent decoding layers, guided by multi-scale context information, can decode more accurate and higher-resolution geometric information. Finally, the globally optimized high-resolution cost volume was fed into the disparity regression module to obtain the disparity map. Experimental results demonstrate that the proposed algorithm reduces the mismatching rate to 1.76% on the KITTI 2015 dataset and 1.24% on KITTI 2012, and the endpoint error on the SceneFlow dataset to 0.56 px. Compared with advanced stereo matching algorithms such as GWCNet and CPOP-Net, the proposed algorithm shows a clear improvement in ill-posed regions.
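The two bookend steps described in the abstract — the concatenation-based cost volume and the soft-argmin disparity regression commonly used in PSMNet-derived networks — can be sketched in NumPy. This is a minimal illustration under assumed shapes and function names, not the paper's implementation; the dense multi-scale guided aggregation network that sits between these two steps is omitted.

```python
import numpy as np

def build_concat_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-based cost volume (PSMNet style).

    left_feat, right_feat: feature maps of shape (C, H, W).
    Returns an array of shape (2*C, max_disp, H, W): for each candidate
    disparity d, the left features at column x are stacked with the right
    features at column x - d (out-of-range positions stay zero).
    """
    C, H, W = left_feat.shape
    cost = np.zeros((2 * C, max_disp, H, W), dtype=left_feat.dtype)
    for d in range(max_disp):
        cost[:C, d, :, d:] = left_feat[:, :, d:]
        cost[C:, d, :, d:] = right_feat[:, :, :W - d]
    return cost

def soft_argmin(agg_cost):
    """Differentiable disparity regression over an aggregated cost volume.

    agg_cost: shape (max_disp, H, W); lower cost means a better match.
    Returns the expected disparity under softmax(-cost), shape (H, W).
    """
    logits = -agg_cost
    logits -= logits.max(axis=0, keepdims=True)   # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    disp_values = np.arange(agg_cost.shape[0]).reshape(-1, 1, 1)
    return (prob * disp_values).sum(axis=0)
```

In the full network, the cost volume would be aggregated by 3D convolutions under the guidance of the dense multi-scale features before the regression step; the soft argmin makes the whole pipeline end-to-end trainable.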

binocular vision; stereo matching; dense multi-scale features; adaptive fusion

ZHANG Bo, ZHANG Meiling, LI Xue, ZHU Lei


School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, Shaanxi, China


Supported by the National Natural Science Foundation of China (61971339); the Natural Science Basic Research Program of Shaanxi Province (2019JQ-361); the Scientific Research Program of Shaanxi Provincial Education Department, Natural Science Special Project (19JK0361)

2024

Journal of Xi'an Polytechnic University
Xi'an Polytechnic University


CSTPCD
Impact factor: 0.473
ISSN:1674-649X
Year, volume (issue): 2024, 38(1)