Low-light optical flow estimation with hidden feature supervision using a Siamese network
Objective Optical flow estimation is widely used in target tracking, video temporal super-resolution, behavior recognition, scene depth estimation, and other vision applications. However, imaging under low-light conditions can hardly avoid a low signal-to-noise ratio and motion blur, which makes low-light optical flow estimation very challenging. Applying low-light image enhancement as a pre-processing stage can effectively improve the visual quality of the image, but it may not help the subsequent optical flow estimation. Instead of following the "low-light enhancement first, optical flow estimation next" strategy, low-light image enhancement should be considered jointly with optical flow estimation to prevent the loss of scene motion information. The optical flow features are encoded into a latent space, which enables supervised feature learning on paired low-light and normal-light datasets. This paper also shows that feature enhancement oriented toward the downstream task outperforms general visual enhancement of low-light images. The main contributions of this paper are summarized as follows: 1) A dual-branch Siamese network framework is proposed for low-light optical flow estimation. A weight-sharing block is used to establish the correlation of motion features between low-light images and normal-light images. 2) An iterative low-light flow estimation module, which can be supervised using normal-light hidden features, is proposed. Our solution requires no explicit enhancement of low-light images. Method This paper proposes a dual-branch Siamese network to encode low-light and normal-light optical flow features. The encoded features are then used to estimate the optical flow in a supervised manner. Our dual-branch feature extractor is constructed using a weight-sharing block, which encodes the motion features. Importantly, our algorithm does not need the pre-stage low-light enhancement that most existing optical flow estimation methods employ. To overcome the high spatial-temporal
computational complexity, this paper proposes to compute a K-nearest-neighbor correlation volume instead of the 4D all-pairs correlation volume. To better fuse local and global motion features, an attention mechanism is introduced for 2D motion feature aggregation. After feature extraction, a discriminator is used to distinguish the low-light image features from the normal-light image features. Training of the feature extractor is complete when the discriminator can no longer tell the two apart. To avoid explicit enhancement of low-light images, the final optical flow estimation module is composed of a feature enhancement block and a gated recurrent unit (GRU). The optical flow is decoded iteratively from the enhanced features produced by the block. A latent feature supervised loss and an iterative similarity loss are used to ensure convergence during training. In the experiments, the network is trained on an NVIDIA GeForce RTX 3080 Ti GPU. The input images are uniformly cropped to a spatial resolution of 496 × 368 pixels. Because paired low-light and normal-light image datasets are limited, the flying chairs dark noise (FCDN) and the various brightness optical flow (VBOF) datasets are jointly used for model training. Result The proposed algorithm is compared with three state-of-the-art optical flow estimation models on several low-light and normal-light datasets, including the FCDN, VBOF, Sintel, and KITTI datasets. In addition to visual comparison, a quantitative evaluation with the end-point error (EPE) metric is conducted. Experimental results show that the proposed method achieves performance comparable with the best available optical flow estimation methods under normal illumination conditions. Under low-light conditions, the proposed solution improves the EPE by up to 0.16 on the FCDN dataset compared with the second-best solution. On the VBOF dataset, the proposed solution improves the EPE by 0.08 compared with the second-best
algorithm. Visual comparisons with all the compared methods are also provided. The results show that the proposed model preserves more accurate details than other optical flow estimation methods, especially under low-light conditions. Conclusion In this paper, a dual-branch Siamese network is proposed to accurately encode optical flow features under normal-light and low-light conditions. The feature extractor is constructed with a weight-sharing block, which enables better supervised learning for low-light optical flow estimation. The proposed model has remarkable advantages in accuracy and generalizability for flow estimation. The experimental results indicate that the proposed supervised low-light flow estimation outperforms state-of-the-art solutions in terms of precision.
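
The EPE values reported in the Result part follow the usual definition of the metric: the Euclidean distance between each estimated flow vector and its ground-truth vector, averaged over pixels. The sketch below illustrates only that standard definition. The function name `end_point_error` and the toy flow vectors are hypothetical, and any validity masking or per-dataset evaluation protocol used in the paper is not stated in the abstract and is not modeled here.

```python
import math


def end_point_error(flow_pred, flow_gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both arguments are equal-length sequences of (u, v) displacement pairs,
    one pair per pixel. This is the textbook EPE definition; it is not taken
    from the paper's code.
    """
    if not flow_pred or len(flow_pred) != len(flow_gt):
        raise ValueError("flows must be non-empty and have the same length")
    total = 0.0
    for (u_p, v_p), (u_g, v_g) in zip(flow_pred, flow_gt):
        # Per-pixel error: length of the difference vector.
        total += math.hypot(u_p - u_g, v_p - v_g)
    return total / len(flow_pred)


# Toy example with three pixels (values chosen by hand for illustration).
pred = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
gt = [(1.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
epe = end_point_error(pred, gt)  # per-pixel errors are 0, 2 and 5
```

Because EPE is an error, a lower value is better, so an "improvement" of 0.16 corresponds to a reduction of the average per-pixel error by 0.16 pixels.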
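
The K-nearest-neighbor correlation volume mentioned in the Method part can be pictured as keeping, for every pixel of the first frame, only the K strongest matches in the second frame rather than a score for every pixel pair. The following is a minimal sketch of that idea under stated assumptions: dot-product similarity, a per-pixel top-K selection, and the hypothetical name `knn_correlation`. The abstract does not describe how the paper selects neighbors, and this naive version still evaluates every pair before discarding most of them, so it shows the reduced storage but not an efficient search.

```python
import heapq


def knn_correlation(feat1, feat2, k):
    """Sparse correlation: top-k dot-product matches for each vector in feat1.

    feat1 and feat2 are lists of equal-dimension feature vectors, one per
    pixel (flattened). Returns, for each vector in feat1, a list of
    (index_in_feat2, score) pairs sorted by decreasing score.
    Illustrative assumption only; not the paper's implementation.
    """
    if k <= 0 or k > len(feat2):
        raise ValueError("k must be between 1 and len(feat2)")
    volume = []
    for f1 in feat1:
        # Dense row of the all-pairs volume for this pixel.
        scores = [(sum(a * b for a, b in zip(f1, f2)), j)
                  for j, f2 in enumerate(feat2)]
        # Keep only the k largest correlations.
        top = heapq.nlargest(k, scores)
        volume.append([(j, s) for s, j in top])
    return volume


# Toy features: 2 pixels in frame 1, 4 pixels in frame 2, dimension 2.
feat1 = [[1.0, 0.0], [0.0, 1.0]]
feat2 = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [-1.0, 0.0]]
sparse = knn_correlation(feat1, feat2, k=2)
```

For two frames of N pixels each, the stored volume shrinks from N × N scores to N × K, which is the memory saving the sparse representation is meant to provide.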