CAAI Transactions on Intelligent Systems, 2024, Vol. 19, Issue 1: 106-113. DOI: 10.11992/tis.202309008

Few-shot oracle bone character recognition based on supervised contrastive learning

BI Xiaojun (毕晓君) 1, MAO Yafei (毛亚菲) 2

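Author information

  • 1. Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance, Ministry of Education, Beijing 100081; School of Information Engineering, Minzu University of China, Beijing 100081
  • 2. College of Information and Communication Engineering, Harbin Engineering University, Harbin, Heilongjiang 150001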

Abstract

Some oracle bone characters occur so rarely that training a deep neural network on them directly causes severe overfitting and, in turn, poor recognition accuracy. To address this problem, this paper proposes a few-shot oracle bone character recognition method based on supervised contrastive learning. The ensemble augmented-shot Y-shaped (EASY) learning framework is chosen as the backbone of the network; with training strategies such as combined data augmentation, multi-backbone ensembling, and feature vector projection, it can recognize characters directly from a small number of labeled samples. Supervised contrastive learning is then introduced, together with a proposed joint contrastive loss, so that feature vectors of the same class lie closer together in the feature space and those of different classes lie further apart, which further improves model performance. Experimental results show that, compared with Orc-Bert, the current best-performing model, the proposed few-shot oracle bone character recognition model improves accuracy by 26.42% on the 1-shot task, 28.55% on the 3-shot task, and 23.98% on the 5-shot task, and thus better addresses the poor recognition accuracy of low-frequency oracle bone characters.

Key words

oracle bone character recognition/few-shot/supervised contrastive learning/EASY framework/deep learning/feature space/joint contrastive loss

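Funding

Key Program of the National Natural Science Foundation of China (62236011)

Major Program of the National Social Science Fund of China (20&ZD279)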

Publication year

2024
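CAAI Transactions on Intelligent Systems
Sponsors: Chinese Association for Artificial Intelligence; Harbin Engineering University

Indexed in: CSTPCD, PKU Core (北大核心)
Impact factor: 0.672
ISSN: 1673-4785
Number of references: 25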