MSPRL: multiscale progressively residual learning network for image inverse halftoning
Objective Halftoning represents continuous-tone images using only two tone levels, black and white; it is widely used in digital image printing, publishing, and display applications for cost reasons. Because a halftone image has only two pixel values, halftoning saves considerable storage space and network transfer bandwidth compared with continuous-tone images, which makes it a practical and important image compression method. Image inverse halftoning is a classic image restoration task that aims to recover continuous-tone images from halftone images containing only bilevel pixels. However, because halftone images lose original image content, inverse halftoning is also a classic ill-posed problem. Although existing inverse halftoning algorithms perform well, their reconstructions still lose image details and features, which causes varying degrees of curvature and roughness in some high-frequency regions and yields visually poor results that do not meet the requirements for fine detail and texture. Recovering high-quality continuous-tone images therefore remains a challenge. Many previous methods focused on model design to improve performance and overlooked the impact of training strategies on model optimization, which limited model performance. To address these problems, we propose an inverse halftoning network that improves the quality of halftone image reconstruction, and we explore different training strategies to optimize model training.
Method In this paper, we propose an end-to-end multiscale progressively residual learning network (MSPRL), which is based on the UNet architecture and takes multiscale input images. To make full use of the information in the different input images, we design a shallow feature extraction module to capture the attention features of images at different scales. We divide our model into an encoder and a decoder, where the encoder focuses on restoring content
information, and the decoder receives the aggregated features of the encoder to strengthen deep feature learning. Both the encoder and the decoder are composed of residual blocks (RBs). MSPRL comprises three levels, each of which receives an input halftone image at a different scale. To collect the encoder features and pass them to the decoder, we use a Concat operation followed by a 1 × 1 convolution as the feature fusion module (FF), which aggregates the feature maps of the encoders at different levels. In the overall model, input halftone images are learned progressively from the left encoder to the right decoder. We systematically study how different training strategies affect model training and reconstruction performance. For example, training with a 128 × 128 pixel patch size gives slightly lower performance than training with a 256 × 256 pixel patch size, but it shortens training time by about 65%. Adding a fast Fourier transform (FFT) loss further improves model performance compared with using the L1 loss alone. We also compare different feature channel dimensions, feature extraction blocks, and activation functions. Experimental results demonstrate that effective learning strategies can optimize model training and significantly improve performance.
Result We compare our results with those of six methods on seven datasets: a denoising convolutional neural network, VDSR, an enhanced deep super-resolution network, a progressively residual learning network (PRL), a gradient-guided residual learning network, and a multi-input multi-output UNet, together with a retrained PRL (PRL-dt). On the Places365 and Kodak datasets, compared with that of the second-best-performing model PRL-dt, the peak signal-to-noise ratio (PSNR) of our MSPRL is higher by 0.12 dB and 0.18 dB, respectively. On five other test datasets commonly used for image super-resolution (Set5, Set14, BSD100, Urban100, and Manga109), compared with that of the
second-best-performing model PRL-dt, the PSNR of MSPRL is higher by 0.11 dB, 0.25 dB, 0.08 dB, 0.39 dB, and 0.35 dB, respectively. With our training strategies, PRL-dt improves average PSNR by 1.44 dB on the seven test datasets over the PRL trained without these optimizations. Extensive experiments demonstrate that MSPRL achieves clear improvements in reconstructed image details and textures.
Conclusion In this paper, we propose an inverse halftoning network to address the problem of low-quality reconstruction in inverse halftoning. Our MSPRL contains a shallow feature extraction module (SFE), an FF, and an encoder and a decoder built around RBs. It combines the advantages of the UNet architecture and multiscale image information, and it uses appropriate training strategies to improve image reconstruction quality and the visual quality of details and textures. Extensive experiments demonstrate that our MSPRL outperforms previous approaches and achieves state-of-the-art performance.
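
The PSNR gains reported above can be read against the standard definition, PSNR = 10 · log10(peak² / MSE). The following is a minimal sketch of that definition only, not the evaluation code used in the paper. The function name `psnr`, the list-of-rows image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are given as lists of rows of pixel values (an illustrative
    representation, assumed here). `peak` is the maximum possible pixel
    value; 255 assumes 8-bit images.
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        if len(ref_row) != len(res_row):
            raise ValueError("rows must have the same length")
        for ref_px, res_px in zip(ref_row, res_row):
            squared_error += (ref_px - res_px) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    # Toy 1 x 2 example: each pixel is off by 10, so MSE = 100.
    print(round(psnr([[100, 100]], [[110, 90]]), 2))  # 28.13
```

Because PSNR is logarithmic in the mean squared error, a gain of d dB on a single image at a fixed peak value corresponds to multiplying the MSE by 10^(-d/10). Under that reading, a gain of 0.39 dB corresponds to roughly 9% lower MSE.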