Visual Pattern Representation and Coding Based on Deep Generative Models
The performance of early intelligent encoding methods was limited by manually designed solutions,while current neural network-based encoding methods lack interpretability,which hinders subsequent analysis and interaction between humans and machine vision.In-spired by generative models,the generative encoding methods aim to achieve compression and synthesis of images and videos by con-structing efficient generative models,obtaining interpretable compact visual representations,and synthesizing high-quality visual content that conforms to the prior distribution of images.Among them,conceptual image encoding and conceptual video encoding leverage the pow-erful sample generation capability and compact hierarchical visual representation models of generative models,resulting in superior encod-ing performance for images and videos.Cross-modal semantic coding,on the other hand,enables cross-modal transformation and coding between the image and text domains while maintaining interpretability,achieving ultra-high compression ratios of thousands of times and satisfactory reconstruction results.
intelligent video encodinggenerative encodingcross-modal compressionconceptual coding