Advances of low-level vision reconstruction in raw domain (像感域(Raw域)底层视觉重建技术进展)

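Chinese abstract (translated): Low-level vision reconstruction aims to reconstruct high-quality images and videos under constrained imaging conditions, and is important for subsequent visual processing and display. Because raw-domain data (raw data) has a high bit depth and responds linearly to the amount of light received, raw-domain vision reconstruction has attracted growing attention from academia and industry in recent years. This paper focuses on six representative vision reconstruction tasks: low-light enhancement and denoising, super-resolution, high dynamic range reconstruction, moiré removal, multi-task joint reconstruction, and data generation. It reviews progress in deep-learning-driven raw-domain vision reconstruction: it systematically summarizes representative methods, outlines the strengths and limitations of each class of methods, and analyzes, task by task, the distinctive properties and advantages of raw-domain data over color-domain data (data that has undergone denoising, demosaicing, white balance, tone mapping, and color space conversion, for example to RGB or sRGB). It also surveys the open-source datasets in each area, including image, burst, and video datasets, and summarizes dataset construction methods and the spatial/temporal alignment strategies for paired data, providing a reference for future dataset creation. Finally, it summarizes the problems and difficulties of existing methods and discusses the development trends of raw-domain low-level vision reconstruction.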
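To make the distinction between raw-domain and color-domain data concrete, the following is a minimal sketch of the kind of processing chain the abstract lists (demosaicing, white balance, tone mapping). It is an illustration under stated assumptions, not a pipeline taken from the review: the RGGB layout, the black and white levels, the white-balance gains, the plain gamma curve standing in for sRGB tone mapping, and the function name `raw_to_srgb` are all hypothetical, and denoising and color-space matrix conversion are omitted.

```python
def raw_to_srgb(bayer, black_level=64, white_level=1023,
                wb_gains=(2.0, 1.0, 1.5), gamma=2.2):
    """Toy raw-to-color pipeline for an RGGB mosaic (list of rows of ints).

    Returns a half-resolution image as rows of (R, G, B) tuples in [0, 255].
    All default parameter values are illustrative assumptions.
    """
    scale = float(white_level - black_level)

    def normalize(value):
        # Black-level subtraction and scaling to [0, 1]; the data stays linear.
        return min(max((value - black_level) / scale, 0.0), 1.0)

    height, width = len(bayer), len(bayer[0])
    image = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            # Naive demosaicing: collapse each 2x2 RGGB cell into one pixel.
            r = normalize(bayer[y][x])
            g = 0.5 * (normalize(bayer[y][x + 1]) + normalize(bayer[y + 1][x]))
            b = normalize(bayer[y + 1][x + 1])
            # White balance: per-channel gains applied to linear values.
            balanced = (r * wb_gains[0], g * wb_gains[1], b * wb_gains[2])
            # Tone mapping (plain gamma) and 8-bit quantization: after this
            # step the values are no longer linear and the bit depth is lower.
            row.append(tuple(round(255 * min(c, 1.0) ** (1.0 / gamma))
                             for c in balanced))
        image.append(row)
    return image
```

Calling `raw_to_srgb` on a 4x4 mosaic yields a 2x2 color image. The clipping and quantization in the last step are where the high bit depth and linearity of the raw input are lost.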

English abstract: Low-level vision reconstruction aims to reconstruct high-quality images and videos under limited imaging conditions, which is important for subsequent visual analysis. Images (videos) in the raw domain have two advantageous features: higher bit depth (10, 12, or 14 bits) and intensity that is linear to the irradiance. As a result, raw images contain the most original information, and their noise statistics are simpler than those in the standard RGB (sRGB) domain. Therefore, low-level vision reconstruction with raw inputs has attracted increasing attention from the academic and industrial communities in recent years. This review focuses on low-level vision reconstruction in the raw domain and mainly investigates the progress of deep learning-based vision reconstruction. Six representative vision reconstruction tasks in the raw domain are selected for a comprehensive review, namely low-light enhancement and denoising, super-resolution, high dynamic range (HDR) reconstruction, moiré removal, multi-task joint reconstruction, and raw image generation. Representative methods in the six fields are systematically summarized, the advantages and problems of the various methods are outlined, and the advantages and unique attributes of raw images (videos) compared with sRGB images (videos) are highlighted for the different tasks. Thereafter, the currently open-source raw-domain datasets for low-level vision reconstruction in the various fields are summarized, including image, burst image, and video datasets. The dataset construction methods for each task are introduced, together with different strategies for solving the key problems in dataset construction, namely spatial alignment and temporal alignment. We hope these summaries and comparisons can serve as a reference for researchers who construct their own datasets. This review points out that the six tasks have both task-specific problems and common issues. For example, for denoising and enhancement of videos captured in low light, constructing a supervised dataset with realistic motions and fine-scale textures is still difficult. For multi-frame super-resolution, the key problem is constructing an accurate alignment module. For HDR reconstruction, the deghosting performance still needs to be improved in dark and over-exposed areas. For demoiréing, balancing the performance between color recovery and moiré removal needs to be explored. For multi-task joint reconstruction, improving the adjustability and interpretability of the model is a key problem. Meanwhile, all six tasks need to recover the correct colors while completing their own tasks. However, they have different optimization directions. Introducing special modules to align these optimization directions may be a good solution. In addition, achieving accurate alignment between degraded and ground-truth images is difficult, and many datasets exhibit misalignment. We then review representative industrial applications of raw-domain reconstruction, including intelligent image signal processing and night imaging in smartphones, low-light and HDR imaging in security monitoring cameras, and raw-domain detection in driverless cars. Finally, based on the existing problems and challenges of raw-domain vision reconstruction, we identify four possible development trends. The first is exploiting the specific properties of raw images (videos) for a specific task. Current methods usually utilize the advantages of raw data in terms of higher bit depth and linearity to intensity, and only a few works utilize the specialized structures of raw data. For example, the moiré distribution differs across channels, and the green channel usually has higher intensities than the other channels. We expect more works to explore the special properties of raw data in the popular denoising and super-resolution tasks. The second is improving the availability of large-scale raw data. Many cameras do not provide raw outputs because of the large memory cost. Therefore, the currently constructed raw-domain datasets are usually smaller than those in the sRGB domain. A feasible solution is to design a raw image compression method with sRGB image guidance, enabling raw-domain decoding with a small amount of metadata. The third is alleviating the data-bias problem. A model trained with raw data captured by one camera may not work well on raw images captured by other cameras, so alleviating the data bias is important for real applications. One feasible solution is to jointly utilize physics-driven and data-driven models. The fourth is further improving raw reconstruction performance with large models. The scale of data is important for improving the reconstruction quality. One solution is to first train a large model on a large-scale dataset and then distill the large model into a small one, which can then be deployed on various edge devices. In summary, we expect more works to explore low-level vision reconstruction in the raw domain to improve the imaging quality of various vision systems.
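The remark that noise statistics are simpler in the raw domain can be illustrated with a small sketch. The model below is an assumption chosen for illustration and is not stated in the abstract: a signal-dependent Gaussian approximation of shot noise plus read noise applied to linear raw intensities in [0, 1]. The function name `add_raw_noise` and the parameters `shot_gain` and `read_std` are hypothetical.

```python
import random


def add_raw_noise(signal, shot_gain=0.01, read_std=0.002, rng=None):
    """Add synthetic noise to linear raw intensities (flat list, values >= 0).

    Assumed model: variance = shot_gain * intensity + read_std ** 2, i.e. a
    Gaussian approximation of shot noise plus signal-independent read noise.
    """
    rng = rng or random.Random()
    noisy = []
    for intensity in signal:
        std = (shot_gain * intensity + read_std ** 2) ** 0.5
        # Values are not clipped, so the added noise stays zero-mean.
        noisy.append(intensity + rng.gauss(0.0, std))
    return noisy
```

Because the input is linear in irradiance, the noise variance is an affine function of the clean signal under this model. After white balance, tone mapping, and quantization, that simple relationship no longer holds, which is one way to read the abstract's claim.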

English keywords:

raw-domain vision reconstruction; low light image (video) enhancement in raw domain; raw image (video) denoising; raw image (video) super-resolution; raw image (video) high dynamic range (HDR) reconstruction; raw image (video) demoiréing

Authors:

Yue Huanjing (岳焕景), Yang Wenhan (杨文瀚), Li Chongyi (李重仪), Yang You (杨铀), Liu Wenyu (刘文予), Yang Jingyu (杨敬钰)

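Author affiliations:

School of Electrical and Information Engineering, Tianjin University, Tianjin 300072

Department of Strategic and Advanced Interdisciplinary Research, Peng Cheng Laboratory, Shenzhen 518055

College of Computer Science, Nankai University, Tianjin 300350

School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074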

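Keywords (translated from Chinese):

raw-domain image reconstruction; raw-domain image (video) low-light enhancement; raw-domain image (video) denoising; raw-domain image (video) super-resolution; raw-domain image (video) high dynamic range reconstruction; raw-domain image (video) moiré removal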

Funding:

National Natural Science Foundation of China (three grants)

Grant numbers:

62072331; 62231018; 62376102

Publication year:

2024

DOI:

10.11834/jig.230794

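Journal of Image and Graphics (中国图象图形学报)

Institute of Remote Sensing Applications, Chinese Academy of Sciences; China Society of Image and Graphics; Institute of Applied Physics and Computational Mathematics, Beijing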

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.111

ISSN: 1006-8961

Year, volume (issue): 2024, 29(6)

Number of references: 5