Semantic segmentation of landslide remote sensing images based on an attention-improved RTformer

Tang Hailin¹, Zhang Jun¹, Li Yixu², Li Shenghai¹

Author information

1. College of Mining, Guizhou University, Guiyang 550025
2. College of Agriculture, Guizhou University, Guiyang 550025

Abstract

Existing landslide semantic segmentation networks for remote sensing imagery suffer from large parameter counts, slow training, blurred recognition of landslide boundary regions, and inconsistent classification of multi-scale semantic information. To address these problems, this paper proposes an improved lightweight semantic segmentation model based on RTformer. An atrous (dilated) convolution ASPP (atrous spatial pyramid pooling) module and an SE (squeeze-and-excitation) channel attention module are embedded between modules at different levels of the model. The ASPP module captures semantic information at different scales, while the SE module strengthens feature representation by computing relationships between channels; together they improve the model's feature extraction ability and make it better suited to landslide recognition in remote sensing images. Comparative experiments on the Cityscapes dataset were used to select the dilation rates of the atrous convolutions and the batch size. A self-supervised training task was designed with the Bijie landslide dataset as the pre-training dataset, and the same dataset was used to fine-tune the model and to evaluate its segmentation performance on landslide remote sensing images. The resulting model achieved the best performance on both the Cityscapes dataset and the Bijie landslide dataset: compared with the original RTformer, the mean intersection over union (mIoU) on the two datasets increased by 2.26% and 4.34%, respectively. Compared with classical semantic segmentation models such as FCN, U-Net, DeepLabV3, and SegFormer, the improved model completed the recognition task with the fewest parameters and the fastest inference speed while achieving the best segmentation results.
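
The mIoU metric used to report these results can be illustrated with a minimal sketch. The function names and the toy labels (0 = background, 1 = landslide) are illustrative assumptions rather than anything taken from the paper, and the abstract does not state the exact evaluation protocol (for example, how a class absent from both prediction and ground truth is handled; here it is simply skipped).

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union across classes (hypothetical helper).

    For each class c: IoU = TP / (TP + FP + FN).
    Classes that appear in neither y_true nor y_pred are skipped
    (an assumption; other protocols are possible).
    """
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(num_classes)) - tp  # predicted c, truly other
        fn = sum(cm[c]) - tp                                  # truly c, predicted other
        denom = tp + fp + fn
        if denom == 0:
            continue
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened pixel labels, 0 = background, 1 = landslide.
truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
print(mean_iou(truth, pred, num_classes=2))
```

In this toy case each class has two correctly labelled pixels, one false positive, and one false negative, so both per-class IoU values are 2/4 and the mean is 0.5.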

Keywords

image processing / landslide detection / remote sensing image / semantic segmentation / attention mechanism

Publication year

2024

Electronic Measurement Technology

Beijing Institute of Radio Technology

Indexed in: CSTPCD, Peking University Core Journals

Impact factor: 1.166

ISSN: 1002-7300
