Layout segmentation based on Multi-WHFPN and SimAM attention mechanism

As a pre-processing step for OCR, layout segmentation is receiving increasing attention from both academia and industry. To address slow detection, inaccurate target-region boundaries, and easily missed small targets in layout segmentation, the YOLOv7-MSY model is proposed. First, drawing on the idea of residual connections, the Multi-WHFPN network structure is proposed: it uses trainable weight parameters to emphasize the importance of individual features during feature fusion, and it adds a small-target detection head to improve small-target detection. Second, the SimAM attention mechanism is introduced, which evaluates 3D feature weights without adding extra parameters, enhancing important features and suppressing ineffective ones. Finally, YEIOU replaces the localization loss function of the original model, improving convergence speed and regression accuracy. In comparative experiments on a dataset provided by the Jiangsu Provincial Archives, YOLOv7-MSY is more sensitive to target-region boundaries and detects small targets better. The mAP@.5 of YOLOv7-MSY reaches 0.871, which is 7.84% higher than that of the original YOLOv7 model. The model outperforms other types of layout segmentation algorithms, generalizes well, and maintains a relatively high segmentation speed.
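The two mechanisms named in the abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: `simam` follows the commonly published SimAM energy formulation, with `lam` as an assumed regularization default, and `weighted_fusion` assumes a BiFPN-style normalized weighting, since the abstract does not specify the exact fusion rule used in Multi-WHFPN. Both function names and all parameters are hypothetical.

```python
import math


def simam(feature_map, lam=1e-4):
    """Parameter-free SimAM-style attention on a C x H x W nested list.

    Each value is scaled by sigmoid(d / (4 * (var + lam)) + 0.5), where d is
    its squared deviation from the channel mean. `lam` is an assumed default.
    """
    out = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        n = max(len(values) - 1, 1)  # count of "other" neurons in the channel
        mean = sum(values) / len(values)
        sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
        variance = sum(d for row in sq_dev for d in row) / n
        out.append([
            [v / (1.0 + math.exp(-(d / (4.0 * (variance + lam)) + 0.5)))
             for v, d in zip(row, d_row)]
            for row, d_row in zip(channel, sq_dev)
        ])
    return out


def weighted_fusion(features, weights, eps=1e-4):
    """Fuse equal-length feature vectors with learnable scalar weights.

    Assumption: weights are clipped at zero and normalized by their sum,
    as in BiFPN-style fusion; the paper's actual rule may differ.
    """
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + eps
    return [
        sum(wi * f[i] for wi, f in zip(w, features)) / total
        for i in range(len(features[0]))
    ]
```

In a real network the fusion weights would be trained along with the other parameters, whereas `simam` has nothing to learn: its gate depends only on the statistics of each channel, which is why it adds no parameters.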

Keywords: layout segmentation; YOLOv7-MSY; Multi-WHFPN; SimAM; YEIOU

Yang Chenhui (杨陈慧), Zhou Xiaoliang (周小亮), Zhang Heng (张恒), Sun Zheng (孙政), Ye Ning (业宁)

College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037

Nanjing Lantai Information Technology Co., Ltd., Nanjing 210009

National Key Research and Development Program of China

2016YFD0600101

2024

Electronic Measurement Technology
Beijing Institute of Radio Technology

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 1.166
ISSN: 1002-7300
Year, volume (issue): 2024, 47(1)