An Open-Vocabulary Semantic Segmentation Model SAN Integrating Multi-Scale Channel Attention
Wu Ling 1, Zhang Hong 1
Author information
- 1. Taiyuan Normal University, Jinzhong, Shanxi 030619
Abstract
With the development of vision-language models, open-vocabulary methods have been widely applied to recognizing categories outside the annotated label space. Compared with weakly supervised and zero-shot methods, open-vocabulary methods have proven more general and effective. This study aims to improve SAN, a lightweight model for open-vocabulary segmentation: it introduces AFF, a feature fusion mechanism based on multi-scale channel attention, and uses it to improve the dual-branch feature fusion in the original SAN architecture. The improved model is evaluated on multiple semantic segmentation benchmarks, and the results show that performance improves while the number of parameters remains almost unchanged. This improvement can help simplify future research on open-vocabulary semantic segmentation.
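The abstract does not give implementation details, so the following is only a minimal sketch of attentional feature fusion with multi-scale channel attention as the technique is commonly formulated: a local (per-pixel) and a global (pooled) channel context are summed, passed through a sigmoid, and the resulting weights blend the two input feature maps. All function names, shapes, and the reduction ratio `r` are illustrative assumptions, normalization layers are omitted for brevity, and how the two SAN branches are actually wired into the fusion is not stated in the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pointwise_bottleneck(x, w_down, w_up):
    """Two 1x1 convolutions with a ReLU in between (no normalization, an assumption).
    x: (C, H, W); w_down: (C // r, C); w_up: (C, C // r)."""
    hidden = np.maximum(np.einsum("oc,chw->ohw", w_down, x), 0.0)
    return np.einsum("co,ohw->chw", w_up, hidden)

def ms_cam(x, local_w, global_w):
    """Multi-scale channel attention: local context plus globally pooled context."""
    local_ctx = pointwise_bottleneck(x, *local_w)            # (C, H, W)
    pooled = x.mean(axis=(1, 2), keepdims=True)              # (C, 1, 1)
    global_ctx = pointwise_bottleneck(pooled, *global_w)     # (C, 1, 1), broadcast below
    return sigmoid(local_ctx + global_ctx)                   # weights in (0, 1)

def aff(x, y, local_w, global_w):
    """Fuse two feature maps as m * x + (1 - m) * y, with m computed from x + y."""
    m = ms_cam(x + y, local_w, global_w)
    return m * x + (1.0 - m) * y

# Hypothetical usage with random features standing in for the two branches.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4

def make_weights():
    return rng.normal(size=(C // r, C)), rng.normal(size=(C, C // r))

local_w, global_w = make_weights(), make_weights()
x = rng.normal(size=(C, H, W))
y = rng.normal(size=(C, H, W))
z = aff(x, y, local_w, global_w)
```

Because the fusion is a per-element convex combination, the output keeps the input shape and adds only the small bottleneck weights, which is consistent with the abstract's remark that the parameter count barely changes.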
Key words
open vocabulary / semantic segmentation / SAN / CLIP / multi-scale channel attention
Funding
Graduate Education and Teaching Reform Research Project of Taiyuan Normal University (SYYJSJG-2154)
Publication year
2024