Earlier video camouflaged object detection methods typically exploit motion cues either through implicit motion modeling or by directly feeding in noisy, offline-computed optical flow maps, which degrades model performance. To make effective use of motion cues, an explicit motion modeling framework for video camouflaged object detection, called SMHNet, is proposed. First, an explicit motion modeling branch and a camouflaged object detection branch are jointly learned within the same framework. Then, a bidirectional feature updating module updates the two branches in both directions, so that they optimize and correct each other and output optical flow estimates and object detection maps. In addition, because ground-truth optical flow is unavailable, a self-supervised strategy is adopted to supervise the explicit motion modeling branch. Comparison experiments on two benchmark datasets show that SMHNet effectively improves the performance of video camouflaged object detection.
Video camouflaged object detection; explicit motion handling; optical flow; self-supervision
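
The abstract does not say which self-supervised strategy supervises the motion branch. One widely used option for learning optical flow without ground truth is a photometric reconstruction loss: warp the second frame back toward the first using the predicted flow and penalize the remaining intensity difference. The sketch below illustrates that general idea only, assuming grayscale frames stored as nested lists; the helper names `bilinear_sample` and `photometric_loss` are hypothetical, and none of this should be read as SMHNet's actual loss.

```python
def bilinear_sample(img, x, y):
    """Sample a 2D grayscale image (list of rows) at float coordinates.

    Coordinates outside the image are clamped to the border.
    """
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def photometric_loss(frame1, frame2, flow):
    """Mean absolute difference between frame1 and flow-warped frame2.

    flow[y][x] = (dx, dy) is the assumed displacement from pixel (x, y)
    in frame1 to its corresponding location in frame2.
    """
    h, w = len(frame1), len(frame1[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            warped = bilinear_sample(frame2, x + dx, y + dy)
            total += abs(frame1[y][x] - warped)
    return total / (h * w)


# Toy example: a single bright pixel moves one step to the right.
frame1 = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
frame2 = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]
good_flow = [[(1.0, 0.0)] * 4 for _ in range(3)]  # matches the true motion
zero_flow = [[(0.0, 0.0)] * 4 for _ in range(3)]  # predicts no motion

good_loss = photometric_loss(frame1, frame2, good_flow)
zero_loss = photometric_loss(frame1, frame2, zero_flow)
```

In this toy case the flow that matches the true motion gives a lower loss than the zero flow, which is the signal a self-supervised flow branch can be trained on. Practical systems usually add occlusion handling and smoothness terms, which are omitted here.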