Advances of low-level vision reconstruction in raw domain
The low-level vision reconstruction technology aims to reconstruct high-quality images and videos under limited imaging conditions,which is important for subsequent visual analysis.The images(videos)in the raw domain have two advantageous features:wider bit depth(10,12,14 bits)and intensity linear to the irradiance.As a result,raw images contain the most original information and the noise statistics are also simpler than those in standard RGB(sRGB)domain.Therefore,low-level vision reconstruction with raw inputs has achieved an increasing attention from academic and indus-trial communities in recent years.This review focuses on the low-level vision reconstruction technology in the raw domain and mainly investigates the progress of deep learning-based vision reconstruction.Six representative vision reconstruction tasks in raw domain are selected,namely,low-light enhancement and denoising,super-resolution,high dynamic range(HDR)reconstruction,moiré removal,multi-task joint reconstruction,and raw image generation,for a comprehensive review.Representative methods in the six fields are systematically summarized,the advantages and problems of various methods are outlined,and the advantages and unique attributes of raw images(videos)compared with sRGB images(vid-eos)are highlighted in different tasks.Thereafter,the currently open-source low-level vision reconstruction datasets in raw domain in various fields are summarized,including image,burst image,and video datasets.The dataset construction meth-ods for each task are introduced.Different strategies to solve the key problems in dataset construction,namely,spatial alignment and temporal alignment,are also introduced.We hope these summarization and comparisons can provide refer-ences for the followers who construct their own datasets.This review would like to point out that the six tasks not only have unique problems but also have common issues.For example,for denoising and enhancement of videos captured in low light,constructing a supervised dataset with realistic motions and fine-scale textures is still difficult.For multi-frame super-resolution,the key problem is constructing the accurate alignment module.For HDR reconstruction,the deghosting perfor-mance still needs to be improved in dark and over-exposed areas.For demoiréing,balancing the performance between color recovery and moiré removal needs to be explored.For multi-task joint reconstruction,improving the adjustability and interpretability of the model is a key problem.Meanwhile,all the six tasks need to recover the correct colors while complet-ing their own tasks.However,they have different optimization directions.Introducing special modules to ensure their simi-lar optimization directions may be a good solution.In addition,achieving accurate alignment between degraded and ground truth images is difficult,and many datasets exhibit misalignment.Then,we review representative industrial applications of raw domain reconstruction,including intelligent image signal processing and night imaging in smartphones,low-light and HDR imaging in security monitoring cameras,and raw domain detection in driverless cars.Finally,based on the existing problems and challenges of raw domain vision reconstruction,we identify four possible development trends for raw domain vision reconstruction.First is exploiting the specific properties of raw images(videos)for a specific task.Current methods usually utilize the advantages of raw data in terms of wider bit depth and linearity to intensity.Only a few works utilize the specialized structures of raw data.For example,the moiré distribution in different channels differs,and the green channel usually has higher intensities than other channels.We expect more works exploring the special properties of raw data in popular denoising and super-resolution tasks.Second is improving the availability of large-scale raw data.Many cameras do not provide the raw outputs due to the large memory cost.Therefore,the current constructed raw domain datasets are usually smaller than those in sRGB domain.A feasible solution is to design the raw image compression method with sRGB image guidance for enabling raw domain decoding with a few meta data.Third is alleviating the data-bias problem.The model trained with the raw data captured with one camera may not work well when dealing with raw images captured with other cameras.Alleviating the data-bias is important for real applications.One feasible solution is to jointly utilize physics-and data-driven models.Fourth is further improving raw reconstruction performance with large models.The scale of data is important to improve the reconstruction quality.One solution is to first train a large model with a large-scale dataset and then distill the large model to a small one.Then,the small model can be deployed in various edge devices.In summary,we expect more works exploring low-level vision reconstruction in raw domain to improve the imaging quality of various vision systems.
raw-domain vision reconstructionlow light image(video)enhancement in raw domainraw image(video)denoisingraw image(video)super-resolutionraw image(video)high dynamic range(HDR)reconstructionraw image(video)demoriéing