Obstacle extraction from remote sensing images based on V-Transformer
Obstacles in remote sensing images are the most important basis for the variable geometry of observation systems in seismic exploration. Traditional manual obstacle extraction is inefficient, susceptible to human factors, and unable to guarantee consistent results, which makes it unsuitable for complex surface environments and large numbers of obstacles. Current general-purpose methods for automatic obstacle extraction rely on convolutional neural networks, which are limited by the size of their convolution kernels: they cannot directly perform semantic interactions over long distances, and they fail to accurately extract partially occluded obstacles with large spans (country roads, rivers, etc.). This study therefore proposes a V-shaped fully self-attention network (MTNet) to extract obstacles from remote sensing images. First, MTNet adopts an end-to-end V-shaped encoder-decoder structure and realizes information interaction between the encoder and decoder through skip connections. Second, the traditional convolutional layer is replaced by the Mix-Transformer block, which has long-range modeling capability, to extract and reconstruct more accurate multi-scale features of the obstacles. Finally, the transposed convolution is replaced by a lightweight block extending layer that performs upsampling and image segmentation to reconstruct the obstacle information. Experimental results show that the network significantly outperforms existing methods in both accuracy and speed when segmenting obstacles, especially in road recognition.
variable geometry of observation system; deep learning; obstacle extraction; image semantic segmentation; Mix-Transformer
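
As a rough illustration of how upsampling can be done without a transposed convolution, the sketch below doubles the spatial resolution of a feature map by redistributing its channels into space. It is not the paper's block extending layer, whose details are not given in the abstract: the function name `expand_blocks`, the channel ordering, and the absence of any learned projection are all assumptions made for illustration. In a trained network a learned linear projection would typically precede such a rearrangement.

```python
def expand_blocks(feat, scale=2):
    """Rearrange an H x W x C feature map (nested lists) into a
    (H*scale) x (W*scale) x (C // scale**2) map.

    Hypothetical helper: each input position contributes a
    scale x scale block of output positions, and its channels are
    split evenly among them. No learned weights are involved.
    """
    h, w, c = len(feat), len(feat[0]), len(feat[0][0])
    if c % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale**2")
    c_out = c // (scale * scale)

    # Pre-allocate the upsampled grid.
    out = [[None] * (w * scale) for _ in range(h * scale)]
    for i in range(h):
        for j in range(w):
            for di in range(scale):
                for dj in range(scale):
                    # Slice of channels assigned to this sub-position.
                    start = (di * scale + dj) * c_out
                    out[i * scale + di][j * scale + dj] = \
                        feat[i][j][start:start + c_out]
    return out


# One position with 8 channels becomes a 2 x 2 grid with 2 channels each.
feat = [[[1, 2, 3, 4, 5, 6, 7, 8]]]
up = expand_blocks(feat)
print(len(up), len(up[0]), len(up[0][0]))
```

Because the rearrangement only moves values around, it adds no parameters of its own, which is one plausible reason such a layer can be lighter than a transposed convolution.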