Scene Text Detection Algorithm Based on Feature Enhancement

To address the missed and false detections caused by complex backgrounds and highly variable text scales in natural scene images, this paper proposes a scene text detection algorithm based on feature enhancement. In the feature pyramid fusion stage, a Dual-domain Attention Feature Fusion Module (D2AAFM) is proposed, which better fuses feature maps of different semantic levels and scales and thereby improves the representation of text information. In addition, because the deep feature maps of the network lose semantic information during up-sampling and fusion, a Multi-scale Spatial Perception Module (MSPM) is proposed. It enlarges the receptive field to capture broader contextual information and strengthens the text semantic features of the deep feature maps, which effectively reduces missed and false detections. To evaluate the proposed algorithm, experiments are conducted on the public datasets ICDAR2015, CTW1500 and MSRA-TD500, where the overall F-measure reaches 82.8%, 83.4% and 85.3%, respectively. The experimental results show that the algorithm detects text well across different datasets.
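The abstract does not describe the internal design of MSPM, but the keywords name dilated convolution as the means of enlarging the receptive (perceptual) field. As a rough illustration of why dilation helps, the sketch below computes how far a dilated kernel reaches and the receptive field of a stack of such layers. The function names, the 3x3 kernel size, and the dilation rates are assumptions chosen for illustration and are not taken from the paper.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span (in pixels, along one axis) covered by a single dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of stacked conv layers given as (kernel_size, dilation, stride)."""
    rf, jump = 1, 1  # jump = spacing of adjacent outputs measured in input pixels
    for kernel_size, dilation, stride in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical dilation rates; the paper's actual settings are not given in the abstract.
for rate in (1, 2, 4, 8):
    print(f"3x3 kernel, dilation {rate}: spans {effective_kernel(3, rate)} pixels")

# Three stacked 3x3 layers with stride 1 and dilations 1, 2, 4.
print(receptive_field([(3, 1, 1), (3, 2, 1), (3, 4, 1)]))
```

The parameter count of each 3x3 layer stays the same regardless of dilation, so the wider context comes without extra weights, which is the usual motivation for using dilated convolution in multi-scale modules.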

Keywords: Deep learning; Scene text detection; Attention mechanisms; Multi-scale feature fusion; Dilated convolution

高楠、张雷、梁荣华、陈朋、付政

College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023

Funding: National Natural Science Foundation of China (61702456, 62036009, U1909203); National Key Research and Development Program of China (2020YFB1707700)

2024

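Computer Science
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)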

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, Volume (Issue): 2024, 51(6)
  • 26