Cross-CNN: An Animation Cross-Frame Sketch Colorization Algorithm Based on a Hybrid Model with CNN and Transformer
Coloring long sequences of animated sketch frames is a challenging task in computer vision. On the one hand, sketches contain sparse information, so a coloring algorithm must infer what is missing. On the other hand, colors must stay consistent across consecutive frames to preserve visual quality throughout the video. Most existing coloring algorithms are designed for single images and produce only one open-ended, plausible color result, which makes them unsuitable for coloring frame sequences. Other reference-based coloring algorithms do not establish an organic connection between the two frames, which leads to unsatisfactory results. Within a single shot, the features of the same object usually change little, so a model can be designed to color sketches automatically from a given reference frame. This paper proposes a new model, Cross-CNN, that combines a convolutional neural network (CNN) with a Transformer. Cross-CNN finds and matches colors from the reference frame, thereby ensuring temporal feature consistency. In this model, the reference frame and the sketch frame are stacked along the channel dimension, and a pre-trained ResNet50 network extracts locally fused features. The fused feature map is then passed to the Transformer structure, which encodes it to extract global features. Within the Transformer structure, a cross-attention mechanism is designed to better match long-distance features. Finally, a convolutional decoder with skip connections outputs the colored image. For the dataset, frames were extracted from eight movies and strictly screened to create a dataset of 20,000 pairs of reference and sketch frames for experimental research. Cross-CNN reaches an SSIM (Structural SIMilarity) of 0.932, which is 0.014 higher than the state-of-the-art (SOTA) algorithm. The code for this paper is available at https://github.com/silenye/Cross-CNN.
sketch coloring; convolutional neural network; Transformer; color matching; animation production
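
The cross-attention idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how queries, keys, and values are formed, so this example assumes that queries come from sketch-frame tokens and that keys and values come from reference-frame tokens. The function names, the single attention head, the token count (64), and the feature width (16) are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(sketch_tokens, ref_tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    Assumption (not stated in the abstract): queries are projected from the
    sketch tokens, keys and values from the reference tokens, so each sketch
    position gathers features from every reference position.
    """
    q = sketch_tokens @ w_q                      # (N_sketch, d)
    k = ref_tokens @ w_k                         # (N_ref, d)
    v = ref_tokens @ w_v                         # (N_ref, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (N_sketch, N_ref)
    weights = softmax(scores, axis=-1)           # each row sums to 1
    return weights @ v, weights

# Hypothetical sizes: an 8x8 feature map flattened to 64 tokens of width 16.
rng = np.random.default_rng(0)
d_model = 16
sketch_tokens = rng.standard_normal((64, d_model))
ref_tokens = rng.standard_normal((64, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(3))

out, weights = cross_attention(sketch_tokens, ref_tokens, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Because every sketch token attends to every reference token, a region of the sketch can draw on reference features from anywhere in the frame. This is one plausible reading of "matching long-distance features" in the abstract.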