A coupled DeepLab and Transformer approach for fine classification of crop cultivation types in remote sensing
Accurately monitoring the planting of different crop types on complex farmland by remote sensing is key to agricultural area surveys and crop yield estimation in smart rural agriculture. In current pixel-level semantic segmentation of crop planting in high-resolution images, deep convolutional neural networks struggle to capture both multi-scale global spatial features and local details, which leads to problems such as blurred boundary contours between farmland plots and low internal integrity within a single farmland area. To address these shortcomings, this paper proposes a dual-branch parallel feature fusion network (FDTNet) that couples DeepLabv3+ and Transformer encoders to achieve fine-grained remote sensing monitoring of crop planting. First, DeepLabv3+ and a Transformer are embedded in FDTNet in parallel to capture the local and global features of farmland images, respectively. Second, a coupled attention fusion module (CAFM) is used to fuse the two sets of features effectively. Then, in the decoder stage, the convolutional block attention module (CBAM) is applied to increase the weight of the effective features of the convolutional layers. Finally, a progressive multi-level feature fusion strategy is adopted to fully fuse the effective features of the encoder and decoder, and the resulting feature map is output to achieve high-precision classification and recognition of late rice, middle rice, lotus root fields, vegetable fields and greenhouses. To verify the effectiveness of FDTNet in high-resolution crop classification, this paper uses two different high-resolution datasets, the Yuhu dataset and the Zhejiang dataset, on which the mIoU reaches 74.7% and 81.4%, respectively. The mIoU of FDTNet can be 2.2% and 3.6% higher, respectively, than that of existing deep learning methods such as UNet, DeepLabv3, DeepLabv3+, ResT and Res-Swin. The results show that FDTNet achieves better classification performance than the compared methods in two types of farmland scenes: those with a single texture and a large sample size, and those with multiple textures and a small sample size. The proposed FDTNet has a comprehensive ability to extract effective features of crops from multiple categories.
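
For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of the standard definition: the per-class intersection over union, averaged over the classes. It is not the authors' evaluation code. The function names, the flattened label lists and the choice to skip classes that appear in neither the reference nor the prediction are assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are reference classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average the per-class IoU = TP / (TP + FP + FN) over the classes present.

    Assumption: a class absent from both reference and prediction is skipped
    rather than counted as 0 or 1.
    """
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                                   # reference c, predicted other
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp    # predicted c, reference other
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy example: six pixels, three classes (flattened label maps).
reference = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(reference, predicted, num_classes=3))
print(f"mIoU = {score:.3f}")
```

In practice the confusion matrix would be accumulated over every pixel of every test image before the average is taken, so that small images do not dominate the score.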