Natural scene text detection based on enhanced multi-level feature fusion
ZHOU Yan (周燕) 1, WEI Qinbin (韦勤彬) 1, LIAO Junwei (廖俊玮) 1, ZENG Fanzhi (曾凡智) 1, LIU Xiangyu (刘翔宇) 1, ZHOU Yuexia (周月霞) 1
Author information
- 1. School of Electronic Information Engineering, Foshan University, Foshan, Guangdong 528225
Abstract
To address the detection difficulties caused by unfocused small text, text on complex backgrounds, and widely spaced curved text in natural scene images, a natural scene text detection method based on enhanced multi-level feature fusion is proposed. The method comprises a Local Attention Feature Enhanced (LAFE) module and a Multi-level Enhanced Feature Fused (MEFF) module. The LAFE module enlarges the network's receptive field by stacking dilated convolutions and combines channel and spatial attention to strengthen pixel-level classification. The MEFF module, which serves as a multi-level enhanced feature connection branch, introduces deformable convolution to improve information fusion between feature maps. Experimental results show that the proposed method performs well on commonly used text datasets: on ICDAR2015 and Total-Text, the comprehensive index F (F-measure) reaches 88.1% and 86.5%, respectively, which is 0.8% and 1.8% higher than the original method.
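As a rough illustration of two quantities the abstract relies on, the sketch below computes (a) the receptive field of a stack of stride-1 convolutions, showing why stacking dilated convolutions widens it, and (b) the F-measure as the harmonic mean of precision and recall. The 3x3 kernel size, the dilation rates (1, 2, 4), and the function names are assumptions made for this example only; the abstract does not state the configuration actually used in the LAFE module.

```python
def stacked_receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stride-1 convolutions applied in sequence.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F index reported for detectors)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical configurations: three plain 3x3 layers vs. three dilated 3x3 layers.
plain = stacked_receptive_field(3, [1, 1, 1])    # 7
dilated = stacked_receptive_field(3, [1, 2, 4])  # 15

if __name__ == "__main__":
    print("plain stack receptive field:", plain)
    print("dilated stack receptive field:", dilated)
    print("F for precision=0.9, recall=0.8:", f_measure(0.9, 0.8))
```

With the same number of layers and parameters, the dilated stack in this example covers roughly twice the span of the plain stack, which is the general motivation for using dilation when nearby context is needed to classify a pixel.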
Key words
natural scene text detection/attention mechanism/pixel point classification/dilated convolution/feature fusion
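Funding
National Natural Science Foundation of China (61972091)
Natural Science Foundation of Guangdong Province (2022A1515010101)
Natural Science Foundation of Guangdong Province (2021A1515012639)
Year of publication
2024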