CCI-ClipCap: A Chinese Ceramic Image Description Model Based on Prompt Paradigm
[Objective] This study aims to construct a Chinese Ceramic Image Description Model (CCI-ClipCap) to provide technical support for ceramic culture research and digital preservation. [Methods] Building on ClipCap, we introduce the prompt paradigm to improve the model's understanding of cross-modal data, enabling automatic description of ceramic images. We also propose a text similarity evaluation method tailored to structured textual representations. [Results] By applying the prompt paradigm, the CCI-ClipCap model improves the multimodal fusion process, effectively extracting information from ceramic images and generating accurate textual descriptions. Compared to baseline models, the BLEU and ROUGE values increased by 0.04 and 0.14, respectively. [Limitations] The data come from the British Museum's collections rather than from native Chinese datasets, and reliance on this single source may affect the model's performance. [Conclusions] The CCI-ClipCap model generates text with rich levels of expression, demonstrating a solid understanding of ceramic knowledge and a high degree of domain professionalism.
Digital Humanities; Image Captioning; Multimodal Learning; ClipCap; Prompt