To address the visual semantic understanding bias and multimodal semantic bias in multimodal named entity recognition, a confidence-learning-guided label fusion (CLGLF) method is proposed. The method invokes the BLIP-2 pre-trained model to generate image captions, concatenates them with the input text, and performs joint encoding to achieve multimodal feature fusion. Candidate labels and text labels are obtained by decoding the multimodal representations and the text representations, respectively. After aligning the two groups of labels with a KL-divergence loss, a confidence score is computed to evaluate the quality of the multimodal representation. A confidence threshold is then set to screen out biased candidate labels, which are replaced by the text labels at the corresponding positions; this achieves label fusion and completes multimodal named entity recognition. To verify the proposed method, experiments are carried out on the Twitter-2015 and Twitter-2017 multimodal datasets, and the results are compared with seven mainstream methods, such as MSB and UMT. The experimental results demonstrate the effectiveness of CLGLF.
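The confidence-guided label fusion step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fuse_labels`, the threshold value `tau`, and the use of the maximum class probability as the confidence score are assumptions for demonstration.

```python
import torch
import torch.nn.functional as F


def fuse_labels(mm_logits: torch.Tensor, txt_logits: torch.Tensor, tau: float = 0.7):
    """Sketch of confidence-guided label fusion (hypothetical interface).

    mm_logits:  [seq_len, num_labels] logits from the multimodal branch
    txt_logits: [seq_len, num_labels] logits from the text-only branch
    tau:        confidence threshold (illustrative value, not from the paper)
    """
    mm_probs = F.softmax(mm_logits, dim=-1)
    txt_probs = F.softmax(txt_logits, dim=-1)

    # KL-divergence loss used during training to align the two label distributions
    kl_loss = F.kl_div(mm_probs.log(), txt_probs, reduction="batchmean")

    # Confidence of each candidate (multimodal) label: its maximum class probability
    confidence = mm_probs.max(dim=-1).values
    mm_labels = mm_probs.argmax(dim=-1)
    txt_labels = txt_probs.argmax(dim=-1)

    # Replace low-confidence (biased) candidate labels with the text labels
    fused_labels = torch.where(confidence >= tau, mm_labels, txt_labels)
    return fused_labels, kl_loss
```

A token whose multimodal prediction is confident keeps its candidate label; a token with a flat, uncertain multimodal distribution falls back to the text-only label, which is the screening-and-replacement behavior the abstract describes.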
Key words
multimodal named entity recognition/image caption/confidence learning/multimodal semantic bias/information extraction