Bi-aggregation and self-merging network for few-shot image semantic segmentation

Few-shot image semantic segmentation is a challenging task that aims to segment objects of novel classes using only a few labeled samples. Mainstream methods often suffer from weakly discriminative features and prototype bias. To alleviate these problems, this paper proposes a few-shot image semantic segmentation method based on a bi-aggregation and self-merging network, which fully mines feature similarity and reduces prototype bias. First, a feature-mask bi-aggregation module is proposed. It builds dense similarity relations between the support features and the query features over all spatial locations, providing global semantic information for feature aggregation and mask aggregation. Specifically, performing feature and mask bi-aggregation on the feature similarity matrices yields, for the query image, enhanced features and an initial mask that carry guidance information. Then, a self-merging decoder is proposed. It reduces prototype bias by merging the self-prototype derived from the initial mask with the known support prototypes, and passes rich category semantic information to the decoder by fusing the enhanced features with the merged prototype. Finally, the decoder's predictions are further refined using base-class prediction information. On PASCAL-5i, the method achieves mIoU of 68.3% and 71.5% in the 1-shot and 5-shot settings, respectively; on COCO-20i, it achieves 46.5% and 51.4%. These results exceed those of mainstream methods, and the method segments target regions of novel classes more accurately.
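The abstract does not give formulas, so the following is only a minimal sketch of the general ideas it names: a dense similarity matrix between query and support positions, a mask aggregated from that matrix, and a merged prototype. The cosine similarity, the softmax weighting with its `temperature`, the masked average pooling, the mixing weight `alpha`, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two channel vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def dense_similarity(query_feats, support_feats):
    """Similarity of every query position to every support position."""
    return [[cosine(q, s) for s in support_feats] for q in query_feats]

def softmax(row, temperature=0.1):
    """Numerically stable softmax; temperature is a hypothetical parameter."""
    m = max(row)
    exps = [math.exp((x - m) / temperature) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_mask(similarity, support_mask, temperature=0.1):
    """Initial query mask: similarity-weighted average of support mask labels."""
    mask = []
    for row in similarity:
        weights = softmax(row, temperature)
        mask.append(sum(w * m for w, m in zip(weights, support_mask)))
    return mask

def masked_average_pooling(feats, mask):
    """Prototype as the mask-weighted mean of the feature vectors."""
    total = sum(mask)
    channels = len(feats[0])
    if total == 0:
        return [0.0] * channels
    return [sum(m * f[c] for f, m in zip(feats, mask)) / total
            for c in range(channels)]

def merge_prototypes(support_proto, self_proto, alpha=0.5):
    """Assumed merge rule: convex combination of the two prototypes."""
    return [alpha * s + (1 - alpha) * q
            for s, q in zip(support_proto, self_proto)]

# Toy data: each feature map is flattened to a list of 2-channel vectors.
support_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_mask = [1.0, 1.0, 0.0, 0.0]  # first two positions are foreground
query_feats = [[0.8, 0.2], [0.2, 0.8], [1.0, 0.1]]

sim = dense_similarity(query_feats, support_feats)
initial_mask = aggregate_mask(sim, support_mask)
support_proto = masked_average_pooling(support_feats, support_mask)
self_proto = masked_average_pooling(query_feats, initial_mask)
merged = merge_prototypes(support_proto, self_proto)
```

In this toy setup the query positions that resemble the support foreground receive a high initial mask value, so the self-prototype is drawn mostly from them. Mixing it with the support prototype illustrates how a query-derived prototype could pull the class representation toward the query image, which is one plausible reading of "reducing prototype bias".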

few-shot semantic segmentation; similarity of features; bi-aggregation; intra-class diversity; self-merging

LIU Yu, YU Ming, ZHU Ye

School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300401

School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401

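Young Scientists Fund of the National Natural Science Foundation of China (No. 62102129); Natural Science Foundation of Hebei Province (No. F2021202030)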

2024

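Chinese Journal of Liquid Crystals and Displays
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; Liquid Crystal Branch of the China Optics and Optoelectronics Manufactures Association; Liquid Crystal Branch of the Chinese Physical Society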

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.964
ISSN: 1007-2780
Year, volume (issue): 2024, 39(10)