Arbitrary-shaped Text Detection in Natural Scenes Based on Deformable DETR

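Text regions in natural scenes vary widely in shape, and describing them directly by contour coordinates models them inadequately, which lowers detection accuracy. To handle irregular text regions in natural scenes, an arbitrary-shaped text detection model based on Deformable DETR is proposed. Unlike conventional methods that predict contour points directly, the model represents text regions with B-splines, which yields smooth and accurate text contours while reducing the number of parameters to be predicted. The model needs no hand-designed components such as anchors or region proposals, which greatly simplifies its design and improves its generality. Without using any additional datasets, the model reaches F-measures of 85.4% and 85.0% on the arbitrary-shaped text datasets CTW1500 and Total-Text, respectively, demonstrating its effectiveness.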
Arbitrary-shaped Text Detection Based on Deformable DETR
Text regions in natural scenes have complex and variable shapes. Directly using contour coordinates to describe text regions makes the modeling inadequate and leads to low text detection accuracy. To address the problem of irregular text regions in natural scenes, an arbitrary-shaped text detection model based on Deformable DETR is proposed. Unlike traditional methods that directly predict contour points, the model uses B-splines to make the text contour smoother and more accurate while reducing the number of parameters to be predicted. The proposed text detection model eliminates the need to manually design components such as anchors and region proposals, which greatly simplifies the design and makes the model more generalizable. The proposed model achieves F-measures of 85.4% and 85.0% on CTW1500 and Total-Text, respectively, which demonstrates the effectiveness of the model.
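
The parameter saving from B-spline modeling can be illustrated with a small sketch. The abstract does not state the spline degree, the number of control points, or how the curve is closed, so the code below assumes a closed uniform cubic B-spline; the function names and the sample control points are hypothetical and are not taken from the authors' implementation. It shows how a handful of control points (the quantities a detector would regress) expand into a dense, smooth contour.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bspline_basis(t: float) -> Tuple[float, float, float, float]:
    """Blending weights of a uniform cubic B-spline segment for t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    return b0, b1, b2, b3


def sample_closed_bspline(
    control_points: List[Point], samples_per_segment: int = 8
) -> List[Point]:
    """Sample a closed uniform cubic B-spline defined by the control points.

    Assumption: the contour is closed by wrapping the control-point indices,
    so n control points produce n curve segments.
    """
    n = len(control_points)
    if n < 4:
        raise ValueError("a cubic B-spline needs at least 4 control points")
    contour: List[Point] = []
    for i in range(n):
        # Each segment blends four consecutive control points (cyclically).
        segment = [control_points[(i + k) % n] for k in range(4)]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            weights = cubic_bspline_basis(t)
            x = sum(w * p[0] for w, p in zip(weights, segment))
            y = sum(w * p[1] for w, p in zip(weights, segment))
            contour.append((x, y))
    return contour


if __name__ == "__main__":
    # Hypothetical control points roughly outlining a curved text region.
    control_points = [
        (0.0, 0.0), (4.0, -1.0), (8.0, 0.0), (9.0, 2.0),
        (8.0, 4.0), (4.0, 5.0), (0.0, 4.0), (-1.0, 2.0),
    ]
    contour = sample_closed_bspline(control_points, samples_per_segment=8)
    print(f"{2 * len(control_points)} regressed values -> "
          f"{len(contour)} contour points")
```

Under these assumptions, eight control points (16 regressed values) yield a 64-point contour, whereas predicting those 64 points directly would require 128 values. The sampling density can be raised after prediction without changing the number of parameters the model has to output.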

Keywords: computer vision; natural scene text detection; Deformable DETR; B-spline

Zhang Zixu (张子旭), You Yuwei (游钰玮), Tong Minglei (仝明磊), Xue Liang (薛亮)

College of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai 201306

College of Mathematics and Physics, Shanghai University of Electric Power, Shanghai 201306

National Natural Science Foundation of China (Grant No. 62105196)

2024

Radio Engineering (无线电工程)
The 54th Research Institute of China Electronics Technology Group Corporation

Impact factor: 0.667
ISSN: 1003-3106
Year, volume (issue): 2024, 54(2)