Balanced single-shot object detection using cross-context attention-guided network
In real-world application scenarios, object detection usually faces two technical challenges: achieving high accuracy and high speed. Although the latest detection frameworks based on anchor-free detection have achieved outstanding performance, their model complexity and slow speed prevent them from being widely deployed in real-world scenarios. In this paper, inspired by the cross-context attention mechanism of the human visual system, we propose a lightweight yet effective single-shot detection framework, the Cross-context Attention-guided Network (CCAGNet), to balance accuracy and speed. CCAGNet uses an attention-guided mechanism to highlight the interaction of object-synergy regions and suppress non-object-synergy regions by combining a Cross-context Attention Mechanism (CCAM), a Receptive Field Attention Mechanism (RFAM), and a Semantic Fusion Attention Mechanism (SFAM). The main contribution of our work is a novel attention mechanism that simultaneously takes channel, spatial, cross-region, and adjacent-region context information into consideration. Extensive experiments demonstrate the feasibility and effectiveness of our method on public benchmark datasets. To the best of our knowledge, CCAGNet achieves state-of-the-art performance on both PascalVOC and MSCOCO, with an excellent trade-off between accuracy and speed among single-shot detectors. In particular, the Average Precision (AP) on small-object detection on MSCOCO is improved by 17.0%. (c) 2021 Published by Elsevier Ltd.
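To make the abstract's idea of attention-guided feature reweighting concrete, the following is a minimal, generic sketch of a block that emphasizes informative regions and suppresses uninformative ones via channel and spatial gating. The module name, structure, and parameters here are illustrative assumptions only; they do not reproduce the paper's actual CCAM, RFAM, or SFAM designs.

```python
# Minimal sketch of attention-guided feature reweighting (channel + spatial
# gating). Illustrative assumption of the general idea; NOT the paper's
# CCAM/RFAM/SFAM modules.
import torch
import torch.nn as nn


class AttentionGuidedBlock(nn.Module):
    """Reweights a feature map so informative (object-synergy) regions are
    emphasized and non-informative regions are suppressed."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: global context -> per-channel gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: per-location gate from pooled channel statistics.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)               # emphasize informative channels
        avg_map = x.mean(dim=1, keepdim=True)      # spatial statistics across channels
        max_map = x.amax(dim=1, keepdim=True)
        gate = self.spatial_gate(torch.cat([avg_map, max_map], dim=1))
        return x * gate                            # suppress low-attention locations


if __name__ == "__main__":
    feat = torch.randn(1, 256, 40, 40)             # e.g. a backbone feature map
    print(AttentionGuidedBlock(256)(feat).shape)   # torch.Size([1, 256, 40, 40])
```

In a single-shot detector such a block would typically be inserted on the backbone or neck feature maps before the detection heads, adding only a small amount of computation, which is in line with the paper's stated goal of balancing accuracy and speed.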
Keywords: Cross-context attention-guided network; Cross-context attention mechanism; Receptive field attention mechanism; Semantic fusion attention mechanism; Accuracy and speed balance