A Study on a Seal Recognition Method Based on Data Augmentation and Vision Transformer (ViT)

Zhang Zhijian (张志剑)¹, Xia Sudi (夏苏迪)², Liu Zhenghao (刘政昊)¹, Wang Wenhui (王文慧)¹, Chen Shuaipu (陈帅朴)¹, Huo Chaoguang (霍朝光)³

Author information

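1. School of Information Management, Wuhan University, Wuhan 430072; Big Data Institute, Wuhan University, Wuhan 430072; Center for Studies of Information Resources, Wuhan University, Wuhan 430072
2. School of Health Economics and Management, Nanjing University of Chinese Medicine, Nanjing 210023
3. School of Information Resource Management, Renmin University of China, Beijing 100872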

Abstract

Seal recognition is difficult because seal images are hard to collect and annotate and are often degraded. Data augmentation can alleviate the shortage of data, and combining it with the vision transformer (ViT) model to extract global features of seals can improve recognition in complex scenarios. First, the characteristics of the contexts in which seals appear are analyzed, and data augmentation strategies are designed from the results to expand the training set. Seal images are then fed into the ViT model for feature extraction and recognition. We collected and annotated 1,259 seals from 16 calligraphy and painting works, such as "Lanting Xu." After processing by 11 data augmentation modules, the training set contained 127,159 seal images. Compared with the baseline model ResNet50, the ViT model improved the F1 score by 12.17 percentage points; when the extended data produced by augmentation was removed, all models failed to converge. With limited annotated data, data augmentation combined with the ViT model enables accurate seal image recognition. The proposed method still lacks semantic reasoning ability and cannot recognize seals that do not appear in the training set.
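
The augmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the 11 augmentation modules, so the four transformations below (rotation, fading, noise, occlusion) are assumptions chosen to mimic typical seal degradation, and all function names are hypothetical. Images are represented as small nested lists of grayscale values so the example runs with the Python standard library only.

```python
import random

# A grayscale image is a row-major list of rows; each pixel is an int in 0..255.
# All transformations below are illustrative assumptions, not the paper's modules.

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def fade(img, factor):
    """Blend every pixel toward white (255) to mimic faded ink; factor in [0, 1]."""
    return [[round(p + (255 - p) * factor) for p in row] for row in img]

def add_noise(img, rng, amount):
    """Add uniform integer noise in [-amount, amount], clipped to 0..255."""
    return [[min(255, max(0, p + rng.randint(-amount, amount))) for p in row]
            for row in img]

def occlude(img, rng, size):
    """Paint a size x size patch white to mimic a damaged or covered region."""
    height, width = len(img), len(img[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    out = [row[:] for row in img]
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = 255
    return out

def augment(img, label, copies, seed=0):
    """Return the original sample plus `copies` randomly degraded variants.

    Every variant keeps the label of the source seal, which is how a small
    annotated set can be expanded into a much larger training set.
    """
    rng = random.Random(seed)
    samples = [(img, label)]
    for _ in range(copies):
        out = img
        if rng.random() < 0.5:
            out = rotate90(out)
        out = fade(out, rng.uniform(0.0, 0.5))
        out = add_noise(out, rng, 10)
        out = occlude(out, rng, 2)
        samples.append((out, label))
    return samples

# Toy 4 x 4 "seal" (0 = ink, 255 = paper) with a hypothetical class label.
seal = [[0, 255, 0, 255],
        [255, 0, 255, 0],
        [0, 0, 255, 255],
        [255, 255, 0, 0]]
samples = augment(seal, "seal_A", copies=5)
```

In practice an image library would replace the nested lists, and the expanded samples would be passed to a ViT classifier for training; that part is outside the scope of this sketch.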

Key words

seal recognition / deep learning / data augmentation / digital humanities

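Funding

National Social Science Fund of China, special research project "Accelerating the Construction of the Discipline System, Academic System, and Discourse System of Philosophy and Social Sciences with Chinese Characteristics" (19VXK09)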

Publication year

2024

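Journal of the China Society for Scientific and Technical Information (情报学报)

China Society for Scientific and Technical Information; Institute of Scientific and Technical Information of China

Indexed in: CSTPCD, CSSCI, CSCD, CHSSCD, Peking University Core Journals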

Impact factor: 1.296

ISSN: 1000-0135

Number of references: 54
