Deep learning-based foveated rendering in 3D space: a review
The widespread adoption of virtual reality (VR) and augmented reality technologies across sectors including healthcare, education, military, and entertainment has propelled head-mounted displays with high resolution and wide fields of view to the forefront of display devices. However, attaining a satisfactory level of immersion and interactivity remains a primary challenge in VR, and latency can cause user discomfort in the form of dizziness and nausea. Multiple studies have underscored that a highly realistic VR experience that preserves user comfort entails raising the display refresh rate to 1800 Hz and keeping latency below 3 to 40 ms. Achieving real-time, photorealistic rendering at high resolution and low latency is a formidable objective. Foveated rendering addresses these issues by adjusting the rendering quality across the image according to gaze position, maintaining high quality in the foveal region while reducing quality in the periphery. This technique yields substantial computational savings and faster rendering without a perceptible loss in visual quality. Previous reviews have examined technical approaches to foveated rendering, but they focused on categorizing implementation techniques, and a comprehensive review from the machine learning perspective is still lacking. With the ongoing advances of machine learning in rendering, combining machine learning with foveated rendering is considered a promising research area, especially in postprocessing, where machine learning methods have great potential. Nonmachine learning methods inevitably introduce artifacts. By contrast, machine learning methods are widely applicable in the postprocessing stage of rendering, where they can optimize and improve foveated rendering results and enhance the realism and immersion of foveated images in a manner unattainable through nonmachine
learning approaches. Therefore, this work presents a comprehensive overview of foveated rendering from a machine learning perspective. We first provide background on human visual perception, including the human visual system, contrast sensitivity functions, visual acuity models, and visual crowding. We then briefly describe the most representative nonmachine learning methods for foveated rendering, including adaptive resolution, geometric simplification, shading simplification, and hardware implementation, and summarize their features, advantages, and disadvantages. We also describe the criteria used for method evaluation in this review, including evaluation metrics for foveated images and for gaze-point prediction. Next, we divide machine learning methods into super-resolution, denoising, image reconstruction, image synthesis, gaze prediction, and image application, and summarize them in detail in terms of four aspects: result quality, network speed, user experience, and the ability to handle objects. Among them, super-resolution methods commonly use more neural blocks in the foveal region and fewer in the peripheral region, so super-resolution quality varies by region. Similarly, foveated denoising usually performs fine denoising in the fovea and coarse denoising in the periphery, although denoising has yet to receive extensive attention. The initial attempt to integrate image reconstruction with gaze used generative adversarial networks (GANs) and yielded promising outcomes. Subsequently, some researchers combined direct prediction and kernel prediction for image reconstruction, which is the current state of the art in this field. Gaze prediction is a key development direction for future VR rendering and is mostly combined with saliency detection to predict the location of the viewpoint. Substantial work remains in the field, but unfortunately, only a tiny portion of
the work can be achieved in real time. Finally, we present the current problems and challenges that machine learning methods face. Our review of machine learning approaches in foveated rendering not only clarifies the research prospects and development directions but also offers guidance for future researchers in choosing research directions and designing network architectures.
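The core idea described above, rendering quality that falls off with distance from the gaze point, can be illustrated with a minimal sketch. It is not taken from any method covered in the review: the function names, the three-level quality scheme, and the 5 and 15 degree thresholds are illustrative assumptions, and the pixel-to-angle conversion uses a simple constant pixels-per-degree approximation.

```python
import math


def eccentricity_deg(px, py, gaze_x, gaze_y, pixels_per_degree):
    """Approximate angular distance (degrees) between a pixel and the gaze point.

    Assumes a constant pixels-per-degree factor, which is only a rough
    approximation for wide field-of-view displays.
    """
    return math.hypot(px - gaze_x, py - gaze_y) / pixels_per_degree


def shading_rate(ecc_deg, fovea_deg=5.0, mid_deg=15.0):
    """Map eccentricity to a coarse shading block size (hypothetical thresholds).

    1 -> shade every pixel (foveal region)
    2 -> shade once per 2x2 block (middle region)
    4 -> shade once per 4x4 block (periphery)
    """
    if ecc_deg <= fovea_deg:
        return 1
    if ecc_deg <= mid_deg:
        return 2
    return 4


def rate_map(width, height, gaze, pixels_per_degree):
    """Build a per-pixel map of shading block sizes around the gaze point."""
    gaze_x, gaze_y = gaze
    return [
        [
            shading_rate(
                eccentricity_deg(x + 0.5, y + 0.5, gaze_x, gaze_y, pixels_per_degree)
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


def shaded_fraction(rates):
    """Estimate shading work relative to full-resolution rendering.

    A pixel covered by an r x r block contributes 1 / r^2 shading invocations.
    """
    total = sum(len(row) for row in rates)
    cost = sum(1.0 / (r * r) for row in rates for r in row)
    return cost / total


if __name__ == "__main__":
    rates = rate_map(64, 64, gaze=(32, 32), pixels_per_degree=1.0)
    print(f"relative shading cost: {shaded_fraction(rates):.2f}")
```

In this sketch the estimated cost drops below that of full-resolution rendering as soon as part of the image falls outside the foveal radius, which is the source of the computational savings mentioned above. The artifacts that such coarse peripheral shading introduces are what the machine learning postprocessing methods surveyed here aim to reduce.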