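The goal of Chinese landscape painting style transfer is to introduce the characteristics of traditional Chinese paintings while preserving the content of the original real landscape scene image, so as to generate images with the artistic character of Chinese landscape painting. In recent years, driven by the rapid development of deep learning, convolutional neural networks (CNNs) and generative adversarial networks (GANs) have come to dominate most image generation tasks, including style transfer. Problems remain, however: real scenes tend to lose semantics during style transfer, GAN training suffers from mode collapse, and CNN-based style transfer methods produce checkerboard artifacts. Vision Transformer models offer a new solution for image processing tasks, but their training requires large amounts of data and is computationally complex. To address the low image quality and loss of detail features that these factors cause when generating Chinese paintings, this paper proposes a Chinese landscape painting style transfer network based on detail feature extraction and fusion, namely SSTR (swin style transfer transformer). Built on the StyTr2 network, SSTR introduces the Swin-Transformer model and uses the strong semantic capability of the vision Transformer to preserve the features of the landscape scene. It also uses the hierarchical architecture of the Swin-Transformer and its sliding-window attention computation to extract more details of the artistic style of landscape painting while reducing training complexity. Finally, a CNN decoder is introduced to refine the generated target image. The public vision dataset COCO 2014 and a public landscape painting dataset are used for training, validation, and testing, and the results are compared with baseline methods. The results show that, on the Chinese landscape painting style transfer task, SSTR achieves a style loss of 1.35 and a content loss of 1.88, outperforming StyTr2 on style loss and demonstrating excellent feature extraction and image generation capabilities.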
A Style Transfer Method for Chinese Landscape Painting Based on Detail Feature Extraction and Fusion
The goal of Chinese landscape painting style transfer is to render a real landscape scene image with the artistic features of Chinese painting, guided by a style reference, while preserving the content of the original scene image. Recently, driven by the rapid development of deep learning, convolutional neural networks (CNNs) and generative adversarial networks (GANs) have come to dominate image generation tasks, including style transfer. However, several problems persist, such as the loss of semantics during style transfer, mode collapse in GAN training, and checkerboard artifacts in CNN-based style transfer methods. The vision transformer provides a new solution for image processing tasks, but its training requires a large amount of data and is computationally complex. To address these issues and generate high-quality Chinese paintings, a Chinese landscape painting style transfer network based on detail feature extraction and fusion, SSTR (swin style transfer transformer), is proposed. This approach introduces the Swin-Transformer into the StyTr2 network framework and uses the strong semantic capability of the vision transformer to preserve the features of the landscape scene. In addition, the hierarchical architecture of the Swin-Transformer and its sliding-window attention mechanism are used to extract finer details of the artistic style of landscape paintings while reducing the model's training complexity. Finally, a CNN decoder is added after the Swin-Transformer decoder to refine the resulting image. The public visual dataset COCO and a public landscape painting dataset are used for training, validation, and testing, and the results are compared with several baseline methods. The experimental findings demonstrate that SSTR outperforms StyTr2 in style loss on the Chinese landscape painting style transfer task, showing superior feature extraction and image generation performance.
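The claim that window-based attention lowers training complexity can be illustrated with a small sketch. The abstract gives no implementation details, so the code below is only a hedged illustration of the general windowed-attention idea as it is commonly described for the Swin-Transformer, not the SSTR implementation. The function names (window_partition, cyclic_shift, attention_pairs), the 8 x 8 token grid, the window size of 4, and the shift of 2 are all assumptions chosen for illustration.

```python
def window_partition(grid, window):
    """Split a 2-D token grid (list of rows) into non-overlapping
    window x window blocks, returned in row-major order."""
    height, width = len(grid), len(grid[0])
    assert height % window == 0 and width % window == 0
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            block = [grid[r][left:left + window]
                     for r in range(top, top + window)]
            windows.append(block)
    return windows


def cyclic_shift(grid, shift):
    """Roll the grid up and to the left by `shift` positions, so that the
    next partition groups tokens that sat in different windows before."""
    rolled_rows = grid[shift:] + grid[:shift]
    return [row[shift:] + row[:shift] for row in rolled_rows]


def attention_pairs(height, width, window=None):
    """Count query-key pairs scored by one attention layer.
    Global attention: every token attends to every token.
    Windowed attention: every token attends only within its window."""
    tokens = height * width
    if window is None:
        return tokens * tokens
    return tokens * window * window


# Hypothetical 8 x 8 grid of token ids (assumed sizes, for illustration only).
grid = [[r * 8 + c for c in range(8)] for r in range(8)]

windows = window_partition(grid, 4)                           # regular windows
shifted_windows = window_partition(cyclic_shift(grid, 2), 4)  # shifted windows

global_pairs = attention_pairs(8, 8)              # quadratic in token count
windowed_pairs = attention_pairs(8, 8, window=4)  # linear in token count
```

With these assumed sizes, global attention scores 4096 query-key pairs while windowed attention scores 1024. For a fixed window size the windowed count grows linearly with the number of tokens rather than quadratically, which is the sense in which windowing reduces the cost of attention on larger images. Alternating regular and shifted partitions lets information cross window boundaries between layers.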