Semantic-Enhanced Zero-Shot Oracle Character Recognition
Oracle bone character recognition holds significant value for understanding Chinese history and for the inheritance of Chinese culture. Currently, manual recognition of oracle bone characters requires extensive expert experience and a great deal of time, while most automatic recognition methods are constrained by the closed-set assumption. This limitation is pronounced in the context of oracle bones, where new characters are continuously discovered. To address this, some researchers have achieved zero-shot oracle character recognition by visual matching. This method uses handprinted images as category references and recognizes characters in scanned images through similarity matching against those references. However, it overlooks the large intra-class variance in scanned oracle bone images, so variability in glyphs can lead to mismatches. This paper proposes a two-stage semantic-enhanced zero-shot oracle character recognition method. The first stage is domain-independent character semantic learning, in which the contrastive vision-language pre-training model CLIP is used to extract character semantics from oracle rubbings and template images through prompt learning, addressing the lack of semantic information in oracle characters. To cope with the domain differences between rubbings and templates, we set learnable domain-specific prompts and character category prompts and decouple their semantics to achieve more accurate feature extraction. The second stage is semantic-enhanced visual matching of character images, in which the model extracts intra-class shared features and inter-class distinctive features through two branches. The first branch uses contrastive learning to align the visual features of different glyphs within the same character category with the character semantics, guiding the model to focus on intra-class shared features. The second branch employs the N-pair loss to enhance the model's ability to learn distinctive features between different character categories. During the testing phase, the model does not require semantic features; instead, it uses the intra-class similarity and inter-class distinctiveness learned during training to match rubbings and templates more accurately, improving zero-shot recognition performance. Experimental validation on the scanned image dataset OBC306 and the handprinted image dataset SOC5519 demonstrates that the proposed method surpasses the baseline method in zero-shot oracle character recognition accuracy by over 25%.
oracle character recognition; zero-shot recognition; visual matching; semantic-enhanced; vision-language model; contrastive learning
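As a rough illustration of two ideas named in the abstract, the sketch below shows (a) test-time matching of a rubbing embedding to the most similar template embedding and (b) a standard form of the N-pair loss. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not specify the similarity measure, the embedding normalization, or the exact loss formulation, so cosine similarity, plain Python lists as embeddings, and the function names `match_to_templates` and `n_pair_loss` are all hypothetical choices made for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; assumes neither vector is all zeros."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def match_to_templates(query, templates):
    """Return the label whose template embedding is most similar to `query`.

    `templates` maps a character label to one template embedding.
    Cosine similarity is an assumption; the abstract only says
    "similarity matching".
    """
    return max(templates, key=lambda label: cosine(query, templates[label]))


def n_pair_loss(anchors, positives):
    """Standard N-pair loss over N classes.

    anchors[i] and positives[i] are embeddings of two different samples
    from class i. For each anchor, the positives of all other classes act
    as negatives:

        loss_i = log(1 + sum_{j != i} exp(a_i . p_j - a_i . p_i))

    The result is the mean of loss_i over all anchors.
    """
    total = 0.0
    for i, a in enumerate(anchors):
        pos_score = dot(a, positives[i])
        neg_sum = sum(
            math.exp(dot(a, p) - pos_score)
            for j, p in enumerate(positives)
            if j != i
        )
        total += math.log1p(neg_sum)
    return total / len(anchors)


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for template (handprinted) features.
    templates = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
    print(match_to_templates([0.9, 0.2], templates))

    anchors = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[1.0, 0.0], [0.0, 1.0]]   # positives agree with anchors
    swapped = [[0.0, 1.0], [1.0, 0.0]]   # positives belong to the wrong class
    print(n_pair_loss(anchors, aligned), n_pair_loss(anchors, swapped))
```

In this toy setting the loss is lower when each anchor is closest to its own class's positive than when the positives are swapped, which is the inter-class separation the second branch is described as encouraging. Because matching uses only visual embeddings, the sketch is also consistent with the statement that no semantic features are needed at test time.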