
Research on Image Restoration Based on Masked Autoencoders

Masked image modeling (MIM) has attracted considerable attention for its potential in visual representation learning. Existing MIM methods that rely on a simple pixel reconstruction loss generate low-quality, blurry outputs. To address this shortcoming, a framework for image generation and self-supervised representation learning based on a masked autoencoder is proposed. The key idea is that the model uses semantic tokens learned by VQGAN as both its input and its output, combines them with masking, and adds a contrastive loss and a noise loss to achieve the dual goals of generation and representation learning. The contrastive loss shapes the embedding space of image samples to promote meaningful representation learning. At the same time, the noise loss encourages the model to reconstruct the high-frequency components of the image, which improves its generation ability. Together, these components make the masked autoencoder a powerful and efficient model that can both generate high-quality images and learn useful image representations.
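Two of the ingredients named in the abstract, masking a sequence of discrete tokens and shaping the embedding space with a contrastive loss, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: plain integer ids stand in for VQGAN tokens, `MASK_ID`, `mask_tokens`, and `info_nce` are hypothetical names, and the contrastive term is the common InfoNCE form, which the abstract does not confirm. The noise loss is omitted because the abstract does not define it.

```python
import math
import random

MASK_ID = -1  # hypothetical id for the mask token (assumption, not from the paper)


def mask_tokens(tokens, mask_ratio, rng):
    """Replace a random subset of token ids with MASK_ID.

    Returns the masked sequence and the set of masked positions.
    """
    n_mask = round(len(tokens) * mask_ratio)
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK_ID if i in positions else t for i, t in enumerate(tokens)]
    return masked, positions


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.2):
    """Mean InfoNCE loss: anchors[i] should match positives[i]
    and differ from every other entry of positives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    rng = random.Random(0)
    token_ids = list(range(16))  # stand-in for a 4x4 grid of VQGAN token ids
    masked, positions = mask_tokens(token_ids, mask_ratio=0.75, rng=rng)
    print("masked sequence:", masked)

    views_a = [[1.0, 0.0], [0.0, 1.0]]  # embeddings of two images, view A
    views_b = [[1.0, 0.0], [0.0, 1.0]]  # embeddings of the same images, view B
    print("contrastive loss:", info_nce(views_a, views_b))
```

In this sketch the loss is small when the two views of each image have matching embeddings and grows when they are mismatched, which is the sense in which a contrastive term shapes the embedding space.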

Keywords: mask; autoencoder; VQGAN; restoration effect

骆迪、张乾、柏武贰


School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang, Guizhou 550025, China


2024

Modern Information Technology (现代信息科技)
Guangdong Institute of Electronics (广东省电子学会)


ISSN:2096-4706
Year, Volume (Issue): 2024, 8(3)