Improved stereo matching network based on dense multi-scale feature guided cost aggregation

To further improve the disparity prediction accuracy of stereo matching algorithms in ill-posed regions such as repetitive textures, textureless areas, and edges, an improved dense multi-scale feature guided aggregation network (DGNet) based on PSMNet was proposed. First, a dense multi-scale feature extraction module was designed based on the densely connected atrous spatial pyramid pooling structure. This module extracts region-level features at different scales using atrous convolutions with different dilation rates, and effectively integrates image features of different scales through dense connections, enabling the network to capture rich contextual information. Second, an initial cost volume was formed by concatenating the left feature maps with their corresponding right feature maps at each disparity level. A dense multi-scale feature guided cost aggregation structure was then proposed, which adaptively fuses the cost volume with the dense multi-scale features while aggregating the cost volume, so that the subsequent decoding layers, guided by multi-scale context information, can decode more accurate and higher-resolution geometric information. Finally, the globally optimized high-resolution cost volume was fed into the disparity regression module to obtain the disparity map. Experimental results demonstrate that the proposed algorithm reduces the mismatching rate to 1.76% on the KITTI 2015 dataset and 1.24% on KITTI 2012, and the endpoint error on the SceneFlow dataset to 0.56 px. Compared with advanced stereo matching algorithms such as GWCNet and CPOP-Net, the proposed algorithm shows a clear improvement in ill-posed regions.
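The two bookend steps described in the abstract — the concatenation-based cost volume and the soft-argmin disparity regression commonly used in PSMNet-derived networks — can be sketched in NumPy. This is a minimal illustration under assumed shapes and function names, not the paper's implementation; the dense multi-scale guided aggregation network that sits between these two steps is omitted.

```python
import numpy as np

def build_concat_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-based cost volume (PSMNet style).

    left_feat, right_feat: feature maps of shape (C, H, W).
    Returns an array of shape (2*C, max_disp, H, W): for each candidate
    disparity d, the left features at column x are stacked with the right
    features at column x - d (out-of-range positions stay zero).
    """
    C, H, W = left_feat.shape
    cost = np.zeros((2 * C, max_disp, H, W), dtype=left_feat.dtype)
    for d in range(max_disp):
        cost[:C, d, :, d:] = left_feat[:, :, d:]
        cost[C:, d, :, d:] = right_feat[:, :, :W - d]
    return cost

def soft_argmin(agg_cost):
    """Differentiable disparity regression over an aggregated cost volume.

    agg_cost: shape (max_disp, H, W); lower cost means a better match.
    Returns the expected disparity under softmax(-cost), shape (H, W).
    """
    logits = -agg_cost
    logits -= logits.max(axis=0, keepdims=True)   # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    disp_values = np.arange(agg_cost.shape[0]).reshape(-1, 1, 1)
    return (prob * disp_values).sum(axis=0)
```

In the full network, the cost volume would be aggregated by 3D convolutions under the guidance of the dense multi-scale features before the regression step; the soft argmin makes the whole pipeline end-to-end trainable.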

binocular vision; stereo matching; dense multi-scale features; adaptive fusion

ZHANG Bo, ZHANG Meiling, LI Xue, ZHU Lei


School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, Shaanxi, China


Supported by the National Natural Science Foundation of China (61971339); the Natural Science Basic Research Program of Shaanxi Province (2019JQ-361); the Scientific Research Program of Shaanxi Provincial Education Department, Natural Science Special Project (19JK0361)

2024

Journal of Xi'an Polytechnic University
Xi'an Polytechnic University


CSTPCD
Impact factor: 0.473
ISSN:1674-649X
Year, volume (issue): 2024, 38(1)