Research on Image Restoration Based on Mask Autoencoder
Masked Image Modeling(MIM)has received significant attention due to its tremendous potential in visual representation.Existing MIM methods that use simple pixel to reconstruct loss suffer from generating low-quality image,blurry outputs.To address this shortcoming,a framework for image generation and self-supervised representation learning based on mask autoencoder is proposed.The key point of research on modeling masked images is that the model uses semantic labels learned by VQGAN in both input and output,and combines them with masks to add contrast loss functions and noise loss functions to achieve the dual goals of generation and representation learning.Firstly,use the contrast loss function to shape the embedding space of image samples to promote meaningful representation learning.At the same time,using the noise loss function to encourage the model to reconstruct high-frequency components in the image,thereby improving the generation ability.This comprehensive approach makes mask autoencoder a powerful and efficient model,while also possessing the ability to generate high-quality images and learn useful image representations.