Review on the progress of AIGC visual content generation and traceability
In the contemporary digital era, characterized by rapid technological advancement, multimedia content creation, particularly visual content generation, has become an integral part of modern societal development. The exponential growth of digital media and the creative industry has drawn attention to artificial intelligence generated content (AIGC) technology. The groundbreaking applications of AIGC in visual content generation have not only equipped multimedia creators with novel tools and capabilities but have also delivered substantial benefits across diverse domains, spanning from cinema and gaming to the immersive landscapes of virtual reality. This review comprehensively introduces the profound advancements in AIGC technology, with particular emphasis on visual content generation and its critical facet of traceability. Initially, our discussion traces the evolutionary path of image generation technology, from its inception with generative adversarial networks (GANs) to the latest advancements in Transformer-based auto-regressive models and diffusion probability models. This progression reveals a remarkable leap in the quality and capability of image generation and underscores the rapid evolution of the field, which has transitioned from its nascent stages to an era of explosive growth. First, we delve into the development of GANs, encompassing their evolution from text-conditioned methods to sophisticated techniques for style control and the construction of large-scale models. This line of work pioneered text-to-image generation, and owing to their strong scalability, GANs can further improve their performance by expanding network parameters and dataset size. Furthermore, we explore the emergence of Transformer-based auto-regressive models, such as DALL·E and CogView, which have heralded a new epoch in image generation. The basic strategy of auto-regressive models is to first use a Transformer to predict the feature sequence of an image conditioned on other feature sequences, such as text or sketches, and then use a specially trained decoding network to decode this feature sequence into a complete image; a schematic sketch of this two-stage pipeline is given below. Benefiting from their large-scale parameters, these models can generate highly realistic images. In addition, our discourse covers the burgeoning interest in diffusion probability models, which are renowned for their stable training and high-quality outputs. Diffusion models first adopt an iterative stochastic process that gradually transforms the observed data into a known noise distribution and then reconstruct the original data in the reverse direction from that noise distribution; the forward and reverse processes are summarized below. This stochastic formulation provides a more stable training process while also demonstrating impressive results in terms of generation quality and diversity.
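To make the two-stage auto-regressive strategy concrete, the following is a minimal PyTorch sketch, not the actual DALL·E or CogView implementation: the toy vocabulary sizes, the class and function names (TextToImageTokens, generate), the plain TransformerEncoder with a causal mask, the greedy decoding loop, and the placeholder decoder are all illustrative assumptions introduced here for clarity.

# Minimal sketch of a Transformer auto-regressive text-to-image pipeline
# (illustrative only; sizes and decoding are heavily simplified).
import torch
import torch.nn as nn

TEXT_VOCAB, IMAGE_VOCAB, SEQ_LEN, DIM = 1000, 512, 16, 64  # toy sizes

class TextToImageTokens(nn.Module):
    """Predicts discrete image tokens conditioned on a text token sequence."""
    def __init__(self):
        super().__init__()
        self.text_emb = nn.Embedding(TEXT_VOCAB, DIM)
        self.image_emb = nn.Embedding(IMAGE_VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.to_logits = nn.Linear(DIM, IMAGE_VOCAB)

    def forward(self, text_tokens, image_tokens):
        # Concatenate text and (partial) image token embeddings into one sequence.
        x = torch.cat([self.text_emb(text_tokens), self.image_emb(image_tokens)], dim=1)
        causal_mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.transformer(x, mask=causal_mask)
        return self.to_logits(h[:, -1])  # logits for the next image token

@torch.no_grad()
def generate(model, decoder, text_tokens, n_image_tokens=SEQ_LEN):
    # Stage 1: predict image tokens one at a time (greedy decoding for brevity).
    image_tokens = torch.zeros(text_tokens.size(0), 1, dtype=torch.long)  # start token
    for _ in range(n_image_tokens):
        next_token = model(text_tokens, image_tokens).argmax(dim=-1, keepdim=True)
        image_tokens = torch.cat([image_tokens, next_token], dim=1)
    # Stage 2: a separately trained decoding network maps tokens back to pixels.
    return decoder(image_tokens[:, 1:])

# Example usage with a toy stand-in for the stage-2 decoder:
model = TextToImageTokens().eval()
toy_decoder = lambda tokens: tokens.float().view(-1, 4, 4)  # placeholder, not a real VQ decoder
image = generate(model, toy_decoder, torch.randint(0, TEXT_VOCAB, (1, 8)))

In practice, the stage-2 decoder is a separately trained discrete auto-encoder (for example, a VQ-VAE or VQGAN decoder), and stochastic sampling is used instead of greedy decoding to obtain diverse images.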
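The forward (noising) and reverse (denoising) processes just described are commonly written in the denoising diffusion probabilistic model formulation below; the notation ($\mathbf{x}_t$ for the noisy sample at step $t$, $\beta_t$ for a fixed noise schedule, $\theta$ for the learned network parameters) is the generic textbook form rather than any specific model surveyed in this review.

$$q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}\big(\mathbf{x}_t;\ \sqrt{1-\beta_t}\,\mathbf{x}_{t-1},\ \beta_t\mathbf{I}\big), \qquad q(\mathbf{x}_t \mid \mathbf{x}_0) = \mathcal{N}\big(\mathbf{x}_t;\ \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0,\ (1-\bar{\alpha}_t)\mathbf{I}\big), \quad \bar{\alpha}_t=\prod_{s=1}^{t}(1-\beta_s)$$

$$p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t) = \mathcal{N}\big(\mathbf{x}_{t-1};\ \mu_\theta(\mathbf{x}_t, t),\ \Sigma_\theta(\mathbf{x}_t, t)\big), \qquad \mathcal{L}_{\text{simple}} = \mathbb{E}_{t,\mathbf{x}_0,\boldsymbol{\epsilon}}\big[\lVert \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t)\rVert^2\big]$$

Training reduces to the simple denoising regression objective $\mathcal{L}_{\text{simple}}$, which is one reason the training of diffusion models is more stable than adversarial training.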
As AIGC technology continues to advance, it encounters challenges such as enhancing content quality and providing the precise control needed to meet specific requirements. Within this context, this review conducts a thorough exploration of controllable image generation technology, a pivotal research domain that strives to provide meticulous control over the generated content. Such control is achieved through the integration of supplementary conditions, such as intricate layouts, detailed sketches, and precise visual references, and it empowers creators to preserve their artistic autonomy while upholding exacting standards of quality. One notable facet that has garnered considerable academic attention is the use of visual references as a mechanism for generating diverse styles and personalized outcomes by incorporating user-provided visual elements. This review underscores the profound potential of these methodologies and illustrates their transformative role across domains such as digital art and interactive media. The development of these technologies opens new horizons in digital creativity. However, it also presents profound challenges, particularly concerning image authenticity and the potential for malicious misuse, as exemplified by the creation of deepfakes and the proliferation of fake news. These challenges extend far beyond mere technical intricacies; they encompass substantial risks to individual privacy and security as well as the broader societal implications of eroding public trust and social stability. In response to these formidable challenges, watermark-related image traceability technology has emerged as an indispensable solution. This technology harnesses watermarking techniques to authenticate and verify AI-generated images, thereby safeguarding their integrity. In this review, we categorize these watermarking techniques into distinct types: watermark-free embedding, watermark pre-embedding, watermark post-embedding, and joint generation methods. First, we introduce the watermark-free embedding methods, which treat the traces left during model generation as fingerprints; this inherent fingerprint information is used to attribute generated images to their source models and thus achieve traceability. Second, the watermark pre-embedding methods embed the watermark into the input or training data, such as noise and images, and then use the watermarked data to train the generation model, which implicitly introduces traceability information into the generated images. Third, the watermark post-embedding methods divide the generation of watermarked images into two stages, image generation and watermark embedding, with the watermark embedded after the image has been generated; a minimal sketch of this two-stage pipeline is given below. Finally, the joint generation methods aim to embed watermark information adaptively during the image generation process, minimize the damage caused by fusing the watermark with image features, and ultimately generate images that carry watermarks. Each of these approaches plays a pivotal role in verifying traceability across diverse scenarios and offers a robust defense against potential misuse of AI-generated imagery.
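As a deliberately simplified illustration of the post-embedding category, the sketch below first generates an image with a stand-in generator and then writes a traceability bit string into it afterwards; the least-significant-bit embedding and all function names (generate_image, embed_watermark, extract_watermark) are hypothetical choices made here only to show the two-stage structure, whereas practical post-embedding schemes rely on far more robust, often learned, watermark encoders and decoders.

# Illustrative two-stage post-embedding pipeline:
# (1) generate an image with any model, (2) embed an identification watermark afterwards.
import numpy as np

def generate_image(seed: int, size: int = 64) -> np.ndarray:
    """Stage 1: stand-in for any image generator (e.g., a GAN or diffusion sampler)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 256, size=(size, size, 3), dtype=np.uint8)

def embed_watermark(image: np.ndarray, bits: str) -> np.ndarray:
    """Stage 2: write a traceability bit string into the least significant bits (naive LSB)."""
    marked = image.copy()
    _, width, _ = marked.shape
    for i, bit in enumerate(bits):
        row, col = divmod(i, width)
        marked[row, col, 0] = (marked[row, col, 0] & 0xFE) | int(bit)
    return marked

def extract_watermark(image: np.ndarray, n_bits: int) -> str:
    """Verification: read the bits back to attribute the image to its source."""
    _, width, _ = image.shape
    bits = []
    for i in range(n_bits):
        row, col = divmod(i, width)
        bits.append(str(image[row, col, 0] & 1))
    return "".join(bits)

model_id = "10110010"  # hypothetical identifier of the generating model or user
marked_image = embed_watermark(generate_image(seed=0), model_id)
assert extract_watermark(marked_image, len(model_id)) == model_id

Because the watermark is added after generation, the same embedding routine can be applied to the output of any generator, which is the main practical appeal of this category.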
In conclusion, while AIGC technology offers promising new opportunities in visual content creation, it simultaneously poses significant challenges regarding the security and integrity of generated content. This comprehensive review covers the breadth of AIGC technology, starting from an overview of existing image generation technologies, including GANs, auto-regressive models, and diffusion probability models. It then categorizes and analyzes controllable image generation technology from the perspectives of additional conditions and visual examples. In addition, the review focuses on watermark-related image traceability technology, discusses various watermark embedding techniques and the current state of watermark attacks on generated images, and provides an extensive overview and future outlook of generated-image traceability technology. The aim is to offer researchers a detailed, systematic, and comprehensive perspective on the advancements in AIGC visual content generation and traceability. This study deepens the understanding of current research trends, challenges, and future directions in this rapidly evolving field.
artificial intelligence generated content (AIGC); visual content generation; controllable image generation; security of generated content; traceability of generated images