
Multi-scale Open Vocabulary Target Detection

Existing open vocabulary target detection algorithms tend to discard multi-scale information when handling image-text correspondence, which lowers detection accuracy on small targets. To address this problem, the paper combines a channel attention mechanism with a feature pyramid network to build the C-FPN module and proposes the C-Baron algorithm. In the region selection stage, C-Baron uses a region packing and alignment method to handle image-text correspondence. Experiments show that, compared with the baseline model, C-Baron improves recognition accuracy by 2% on new categories and by 6.3% on base categories.
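The abstract does not describe the internals of the C-FPN module. As a rough illustration only, the sketch below shows one common form of channel attention (a squeeze-and-excitation-style gate) applied independently to each level of a feature pyramid. It assumes a parameter-free sigmoid gate in place of any learned layers, and the names `channel_attention` and `apply_to_pyramid` are hypothetical and do not come from the paper.

```python
import math


def channel_attention(feature_map):
    """Rescale each channel by a gate derived from its global average.

    feature_map: list of C channels, each an H x W list of lists of floats.
    Assumption: the gate is a plain sigmoid of the pooled value; a real
    module would normally pass the pooled vector through learned layers.
    """
    # Squeeze: global average pooling, one scalar per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: map each pooled scalar to a weight in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


def apply_to_pyramid(pyramid):
    """Apply the gate to every pyramid level (levels may differ in H and W)."""
    return [channel_attention(level)[0] for level in pyramid]


# Two pyramid levels with two channels each: 2 x 2 and 1 x 1 resolution.
level_fine = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
level_coarse = [[[1.0]], [[3.0]]]
reweighted = apply_to_pyramid([level_fine, level_coarse])
```

Because the gate is computed per channel and per level, channels with stronger average responses are scaled down less at every resolution, which is one way such a module could preserve scale-specific cues.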

open vocabulary target detection; multi-scale information; multi-modal processing; image-text alignment; C-FPN module

祝岚、翟亚红、徐龙艳、王杰、赵逸凡、叶子恒


School of Electrical and Information Engineering, Hubei University of Automotive Technology, Shiyan 442002, Hubei, China


2024

Journal of Hubei University of Automotive Technology
Hubei University of Automotive Technology


Impact factor: 0.304
ISSN: 1008-5483
Year, volume (issue): 2024, 38(3)