Weakly supervised semantic segmentation is an extremely challenging task in which training is supervised only by image-level labels. Most existing methods multiply the feature map of the backbone's last convolutional layer by the classifier weights to obtain the most discriminative region. However, the most salient features alone are not sufficient for semantic segmentation. In this paper, motivated by the observation that fused multi-layer features carry richer semantic information, the feature maps of different layers are fused and the fused feature maps are used for target object localisation. Migrating this approach to an end-to-end network framework significantly improves semantic segmentation performance. The end-to-end network in this paper achieves 62.5% on the validation set of the Pascal VOC 2012 dataset.
weakly supervised semantic segmentation; end-to-end; CAM
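
The two ideas contrasted in the abstract can be illustrated with a minimal NumPy sketch: a class activation map (CAM) computed as the classifier-weighted sum of feature channels, and the same computation applied to feature maps fused from two layers. The fusion shown here (nearest-neighbour upsampling followed by channel concatenation), the function names, the tensor shapes, and the random weights are all assumptions made for illustration; the abstract does not specify how the layers are fused, so this should not be read as the paper's actual architecture.

```python
import numpy as np


def class_activation_map(features, weights):
    """Weight each feature channel by the classifier weights and sum.

    features: (C, H, W) feature map; weights: (K, C) classifier weights.
    Returns a (K, H, W) array with one map per class, scaled to [0, 1].
    """
    cam = np.einsum("kc,chw->khw", weights, features)
    cam = np.maximum(cam, 0.0)  # keep only positive evidence
    peak = cam.max(axis=(1, 2), keepdims=True)
    return cam / np.maximum(peak, 1e-8)  # guard against all-zero maps


def upsample_nearest(features, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return features.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_layers(shallow, deep):
    """Hypothetical fusion: upsample the deeper map, then concatenate channels."""
    factor = shallow.shape[1] // deep.shape[1]
    return np.concatenate([shallow, upsample_nearest(deep, factor)], axis=0)


rng = np.random.default_rng(0)
shallow = rng.random((8, 16, 16))  # stand-in for an earlier-layer feature map
deep = rng.random((16, 8, 8))      # stand-in for the last-layer feature map
num_classes = 20                   # Pascal VOC 2012 has 20 foreground classes

fused = fuse_layers(shallow, deep)
# Random placeholder weights; in practice these come from the trained classifier.
weights = rng.standard_normal((num_classes, fused.shape[0]))
cam = class_activation_map(fused, weights)
print(fused.shape, cam.shape)  # (24, 16, 16) (20, 16, 16)
```

Passing only `deep` with a matching weight matrix to `class_activation_map` gives the single-layer CAM that the abstract attributes to most existing methods; passing `fused` gives a map at the shallower layer's resolution that draws on both layers.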