Situational verb recognition with semantic-aware distillation

Situational verb recognition (SVR) is a challenging image-understanding task that aims to identify semantically complex situations in images. After reviewing the current state of SVR research, this paper proposes a method that combines the CLIP model with knowledge distillation. The method uses CLIP's cross-modal capabilities to capture subtle associations between images and the verbs describing their situations, then uses knowledge distillation to transfer these associations to the SVR task, improving network performance. Experiments on the standard SWiG dataset indicate that the method surpasses current state-of-the-art techniques while having the smallest parameter count. The main contribution is SKD-VSR, a framework that combines CLIP with semantic-aware knowledge distillation, whose effectiveness is validated through extensive experiments on public datasets.
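
The abstract does not state the loss the authors use. Purely as an illustration of the general technique, the sketch below computes a standard temperature-softened distillation loss, in which a student network's verb scores are pulled toward those of a teacher (for example, scores derived from a CLIP-like model). The verb list, the logits, the temperature, and the function names are all hypothetical, and this is not the SKD-VSR implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical verb classes and made-up scores for a single image.
verbs = ["riding", "jumping", "feeding"]
teacher_logits = [4.0, 1.0, 0.5]  # assumed teacher (CLIP-like) scores
student_logits = [2.0, 1.5, 1.0]  # assumed student scores

loss = distillation_loss(student_logits, teacher_logits)
print(f"distillation loss: {loss:.4f}")
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge. In practice such a term is usually added to the ordinary classification loss with a weighting factor.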

Keywords: situational verb recognition; knowledge distillation; neural networks; computer vision

Zhao Chengbin (赵城斌)

School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756

2024

Modern Computer (现代计算机)
Zhongda Holdings (中大控股)

Impact factor: 0.292
ISSN: 1007-1423
Year, volume (issue): 2024, 30(17)