Research on Generating Hanfu Effect Drawing Based on a Stable Diffusion Model
To address the problem of dynasty confusion in generated Hanfu effect images, which arises because the costume features of each dynasty are difficult to capture accurately, this work builds on the Stable Diffusion model: text and image feature-space vectors are matched against newly input text prompts, V* is embedded as a new token in the embedding layer, and the cross-attention projection parameters Wk and Wv are jointly optimized, ultimately minimizing the model's loss function after it learns the new clothing text features. Through a survey of the literature and historical materials, 163 clothing-related text prompts from the Tang, Song, and Ming dynasties were collected and organized. The generated Hanfu effect images demonstrate that the model can create garment images matching the specific characteristics of each dynasty from the text prompts. Compared with three commonly used text-to-image generation algorithms that do not integrate Hanfu model features, the images generated by this method are clearer and of higher quality. In ablation experiments, the model's use of the specific-ID optimization token V* yields higher image alignment and lower text alignment than the other methods. In the Tang, Song, and Ming dynasty experiments, the mean values of KID (Kernel Inception Distance) and MMD (Maximum Mean Discrepancy) are relatively low, indicating that the proposed model is feasible and effective for optimizing the generation of Hanfu effect images.
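The abstract's core idea, learning a new token embedding V* while jointly optimizing only the cross-attention key and value projections Wk and Wv, can be sketched as below. This is a minimal illustrative sketch in PyTorch under stated assumptions (toy dimensions, random tensors standing in for latents and denoising targets, a single cross-attention layer); it is not the paper's actual implementation or training pipeline.

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Single cross-attention layer: image latents attend to text features."""
    def __init__(self, dim_q, dim_txt):
        super().__init__()
        self.to_q = nn.Linear(dim_q, dim_q, bias=False)    # stays frozen
        self.to_k = nn.Linear(dim_txt, dim_q, bias=False)  # Wk, trainable
        self.to_v = nn.Linear(dim_txt, dim_q, bias=False)  # Wv, trainable

    def forward(self, x, txt):
        q, k, v = self.to_q(x), self.to_k(txt), self.to_v(txt)
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        return attn @ v

# Toy sizes; a real text encoder and U-Net would replace these.
dim_txt, dim_q, vocab = 32, 16, 100
embed = nn.Embedding(vocab + 1, dim_txt)  # extra last row is the new V* token
attn = CrossAttention(dim_q, dim_txt)

# Freeze the attention layer, then re-enable only Wk and Wv, mirroring the
# joint optimization of {V*, Wk, Wv} described in the abstract.
for p in attn.parameters():
    p.requires_grad_(False)
attn.to_k.weight.requires_grad_(True)
attn.to_v.weight.requires_grad_(True)

opt = torch.optim.Adam([embed.weight, attn.to_k.weight, attn.to_v.weight], lr=1e-3)

tokens = torch.tensor([[vocab]])       # a prompt containing only the V* token
latent = torch.randn(1, 4, dim_q)      # stand-in for a noisy image latent
target = torch.randn(1, 4, dim_q)      # stand-in for the denoising target

for _ in range(5):
    opt.zero_grad()
    out = attn(latent, embed(tokens))
    loss = torch.nn.functional.mse_loss(out, target)  # stand-in for the diffusion loss
    loss.backward()
    opt.step()
```

In a full pipeline the same principle applies: all U-Net and text-encoder weights except the V* embedding row and the cross-attention Wk/Wv matrices remain frozen, so the model absorbs the new dynasty-specific clothing concept without disturbing its general generation ability.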