Optical Flow Estimation Method with Shifted Windows Transformer
An optical flow estimation method that fuses the shifted windows Transformer (SWin) with convolution is proposed to address motion blur, occlusion, and large displacement, and it yields more accurate results in occluded areas. First, the original feature map is processed by SWin to obtain enhanced features that capture more self-similarities between pixels and compensate for the local character of convolutional features. Then, the correlation volume is parsed by SWin to obtain a more accurate flow increment; this step comprises 2D motion feature parsing and flow increment calculation. Finally, an occlusion map is introduced to calculate the position embedding, which brings more pixel relationships into the attention calculation. The end point error on Sintel is 1.33, and the average inference time on FlyingChairs is 69 ms, 4.2% lower than that of Global Motion Aggregation; the method outperforms common optical flow estimation methods.
Keywords: optical flow estimation; self-attention mechanism; shifted windows Transformer; positional encoding
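
For readers unfamiliar with the metric reported above, end point error is conventionally the Euclidean distance between the predicted and ground-truth flow vectors, averaged over pixels. The following is a minimal sketch of that standard definition, not the evaluation code used in this work; the function name `end_point_error`, the nested-list flow layout, and the optional `valid` mask are assumptions made for illustration.

```python
import math


def end_point_error(pred, gt, valid=None):
    """Average end point error between two flow fields.

    pred, gt: H x W nested lists of (u, v) flow vectors (assumed layout).
    valid:    optional H x W nested list of booleans; False pixels are skipped.
    """
    total, count = 0.0, 0
    for y, (row_p, row_g) in enumerate(zip(pred, gt)):
        for x, ((up, vp), (ug, vg)) in enumerate(zip(row_p, row_g)):
            if valid is not None and not valid[y][x]:
                continue
            # Per-pixel error: length of the difference vector.
            total += math.hypot(up - ug, vp - vg)
            count += 1
    if count == 0:
        raise ValueError("no valid pixels to evaluate")
    return total / count


# Toy 2 x 2 flow fields (made-up values, for illustration only).
pred = [[(1.0, 0.0), (0.0, 0.0)], [(3.0, 4.0), (0.0, 2.0)]]
gt = [[(1.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)]]

epe_all = end_point_error(pred, gt)
epe_masked = end_point_error(pred, gt, valid=[[True, False], [True, True]])
```

In the toy example the per-pixel errors are 0, 5, 5, and 2, so the unmasked average is 3.0; masking out the second pixel leaves an average of 7/3. A mask of this kind is one way to report the error separately for occluded and non-occluded regions.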