Unlike natural images captured from real-world scenes, screen content images (SCIs) are synthetic images typically composed of various multimedia contents, such as computer-generated text, graphics, and animations. Existing SCI quality assessment methods often fail to fully account for the impact of image edges and global context on the perceived quality of SCIs. To address these issues, this paper proposes a no-reference SCI quality assessment model based on edge assistance and a multi-scale Transformer. First, an edge structure map containing the high-frequency information of a distorted SCI is constructed using the Laplacian of Gaussian (LoG) operator. Then, a convolutional neural network (CNN) extracts and fuses multi-scale features from the input distorted SCI and its corresponding edge structure map, providing additional edge information for model training. In addition, a Transformer-based multi-scale feature encoding module is proposed to better model the global context of image and edge features at different scales, building on the local features obtained by the CNN. Experimental results show that the proposed model outperforms state-of-the-art no-reference and full-reference SCI quality assessment methods and achieves higher consistency with subjective visual perception.
Key words
no-reference screen content image quality assessment / Laplacian of Gaussian / convolutional neural network / Transformer / multi-scale features
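
To make the edge structure map mentioned in the abstract more concrete, the following is a minimal sketch of Laplacian of Gaussian filtering in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function names `log_kernel` and `edge_structure_map`, the default `sigma`, the kernel radius of about three standard deviations, the replicated-border handling, and the use of the absolute response are all choices made here for the example, since the abstract does not specify them.

```python
import math


def log_kernel(sigma=1.0, radius=None):
    """Build a (2r+1) x (2r+1) Laplacian of Gaussian kernel (hypothetical helper).

    The kernel is shifted so that its entries sum to zero, which makes the
    response vanish on perfectly flat image regions.
    """
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumed truncation radius
    size = 2 * radius + 1
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            # LoG(x, y) is proportional to
            # (r^2 - 2*sigma^2) / sigma^4 * exp(-r^2 / (2*sigma^2)).
            value = (r2 - 2 * sigma ** 2) / sigma ** 4
            value *= math.exp(-r2 / (2 * sigma ** 2))
            row.append(value)
        kernel.append(row)
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def edge_structure_map(image, sigma=1.0):
    """Filter a 2D grayscale image (list of rows) with the LoG kernel.

    Returns the absolute filter response, so strong values mark
    high-frequency structure such as text and graphic edges.
    """
    kernel = log_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Replicate border pixels by clamping the indices.
                    yy = min(max(i + dy, 0), height - 1)
                    xx = min(max(j + dx, 0), width - 1)
                    acc += kernel[dy + radius][dx + radius] * image[yy][xx]
            out[i][j] = abs(acc)
    return out


if __name__ == "__main__":
    # Toy 16 x 16 image with a vertical step edge between columns 7 and 8.
    step = [[0.0] * 8 + [1.0] * 8 for _ in range(16)]
    edges = edge_structure_map(step, sigma=1.0)
    print("response next to the edge:", edges[8][7])
    print("response far from the edge:", edges[8][0])
```

On the toy step image, the response is concentrated in the columns next to the step and is zero in the flat regions, which is the behavior an edge structure map relies on. For real images, a library routine such as `scipy.ndimage.gaussian_laplace` would normally be preferred over this loop-based version for speed.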