Journal of Guangdong University of Technology, 2024, Vol. 41, Issue (3): 102-109. DOI: 10.12052/gdutxb.230011

Text Detection in Natural Scenes Embedded Topological Feature

Zheng Xiacong (郑侠聪)¹, Cheng Lianglun (程良伦)¹, Huang Guoheng (黄国恒)¹, Wang Jingchao (王敬超)¹

Author information

  • 1. School of Computers, Guangdong University of Technology, Guangzhou 510006, Guangdong, China

Abstract

In traditional anchor box-based methods for text detection in natural scenes, an anchor box is easily disturbed by other text instances, which causes false detections or reduces accuracy. Moreover, text instances carry strong topological features that are usually ignored, leading to poor performance on curved and circular text. To address this problem, a novel neural network structure is proposed. It introduces graph convolutional networks to model the relationships between neighboring anchor boxes and incorporates the topological features of the anchor boxes to assist the learning of the graph neural network, improving the effectiveness of the overall network. Ablation experiments were conducted on two public natural scene text detection datasets. On CTW1500, the proposed method improved recall, precision, and F-score by approximately 3.0%, 1.9%, and 2.5%, respectively; on Total-Text, the corresponding improvements were approximately 2.2%, 1.8%, and 2.0%. The method was also compared with other text detection algorithms proposed in recent years. The experimental results show that it performs well in complex natural scenes and that the proposed module helps improve text detection performance.
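
The idea of letting neighboring anchor boxes exchange information through a graph convolution can be illustrated with a minimal sketch. This is not the network described in the paper: the k-nearest-neighbor graph construction, the mean (row-normalized) aggregation, the function names `knn_adjacency` and `graph_conv`, and the two-column feature layout (a visual score plus a stand-in "topological" value such as box orientation) are all assumptions chosen for illustration.

```python
import math

def knn_adjacency(centers, k):
    """Row-normalized adjacency (with self-loops) linking each box to its k nearest boxes."""
    n = len(centers)
    adj = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(centers):
        # Rank the other boxes by Euclidean distance between box centers.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.hypot(centers[j][0] - xi, centers[j][1] - yi),
        )
        for j in others[:k]:
            adj[i][j] = 1.0
        adj[i][i] = 1.0  # self-loop so a box keeps its own feature
    # Normalize each row so that aggregation is a mean over the neighborhood.
    return [[v / sum(row) for v in row] for row in adj]

def graph_conv(adj, feats, weight):
    """One layer computing ReLU(A_norm @ X @ W) with plain Python lists."""
    dim = len(feats[0])
    # Aggregate neighbor features: A_norm @ X.
    agg = [[sum(a * f[d] for a, f in zip(row, feats)) for d in range(dim)] for row in adj]
    # Apply the linear map W (rows = input dims, columns = output dims), then ReLU.
    return [
        [max(0.0, sum(x * w for x, w in zip(r, col))) for col in zip(*weight)]
        for r in agg
    ]

# Hypothetical anchor boxes: three close together on one text line, one far away.
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
# Hypothetical per-box features: [visual score, orientation in radians].
feats = [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0], [7.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned weights

adj = knn_adjacency(centers, k=1)
out = graph_conv(adj, feats, identity)
```

With identity weights the layer reduces to averaging each box's features with those of its nearest neighbor, which makes the message passing easy to inspect. A trained model would learn `weight` and would typically stack several such layers.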

Key words

text detection/natural scene/graph convolutional networks (GCN)/topological feature

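Funding

National Natural Science Foundation of China (U20A6003)

National Natural Science Foundation of China-Guangdong Joint Fund (U1801263)

National Natural Science Foundation of China-Guangdong Joint Fund (U1701262)

National Natural Science Foundation of China-Guangdong Joint Fund (U2001201)

Guangdong Provincial Key Laboratory of Cyber-Physical Systems Project (2020B1212060069)

Foshan Key Field Science and Technology Research Program (2020001006832)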

Publication year

2024

Journal of Guangdong University of Technology
Guangdong University of Technology

Impact factor: 0.628
ISSN: 1007-7162
Number of references: 24