
Enhanced Domain Multi-modal Entity Recognition Based on Knowledge Graph

To address the limitations of Chinese Named Entity Recognition (NER) in specific domains, this paper proposes a model that uses a domain (subject) Knowledge Graph (KG) and images to improve entity recognition accuracy in short texts from the computer science domain. Text features are extracted with a model based on Bidirectional Encoder Representations from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM), and attention; image features are extracted with ResNet152; and a word segmentation tool is used to obtain the noun entities in each sentence. The noun entities and the KG nodes are embedded with BERT, and cosine similarity is used to find the KG node most similar to each segmented word in the sentence. The neighbors at distance 1 from that node are retained to generate a best-matching subgraph, which serves as a semantic supplement to the sentence. A Multi-Layer Perceptron (MLP) maps the text, image, and subgraph features into the same space, and a unique gating mechanism performs fine-grained cross-modal fusion of the text and image features. Finally, the multimodal features are fused with the subgraph features through a cross-attention mechanism and fed into a decoder for entity labeling. The method is compared with baseline models on Twitter2015, Twitter2017, and a self-constructed computer science dataset. On the domain dataset, it achieves a precision, recall, and F1 value of 88.56%, 87.47%, and 88.01%, respectively. Compared with the best baseline model, the F1 value is 1.36 percentage points higher, indicating that a domain KG can effectively improve entity recognition.
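The subgraph-matching step described in the abstract (embed noun entities and KG nodes, pick the most similar node by cosine similarity, keep its distance-1 neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy 3-dimensional vectors stand in for BERT embeddings, the adjacency dictionary stands in for the subject KG, and the names `cosine_similarity_matrix` and `best_matching_subgraph` are hypothetical. Taking the union of the per-word neighborhoods as the sentence-level subgraph is also an assumption, since the abstract does not say how matches for several words are combined.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between rows of a (m, d) and rows of b (n, d).

    Assumes no row is the zero vector.
    """
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def best_matching_subgraph(word_vecs, node_vecs, adjacency):
    """For each word, find its most similar KG node and keep that node's
    distance-1 neighbors. Returns (anchor node ids, sorted subgraph node ids).

    The union over all words is an illustrative assumption.
    """
    sims = cosine_similarity_matrix(word_vecs, node_vecs)
    anchors = [int(i) for i in sims.argmax(axis=1)]
    nodes = set()
    for anchor in anchors:
        nodes.add(anchor)                         # the best-matching node itself
        nodes.update(adjacency.get(anchor, ()))   # its 1-hop neighbors
    return anchors, sorted(nodes)


# Toy stand-ins for BERT embeddings of five KG nodes (hypothetical data).
node_vecs = np.array([
    [1.0, 0.0, 0.0],  # node 0
    [0.0, 1.0, 0.0],  # node 1
    [0.0, 0.0, 1.0],  # node 2
    [0.7, 0.7, 0.0],  # node 3
    [0.5, 0.0, 0.5],  # node 4
])

# Undirected toy graph as an adjacency dictionary (hypothetical data).
adjacency = {0: [1, 3, 4], 1: [0], 2: [3], 3: [0, 2], 4: [0]}

# Toy stand-ins for the embeddings of two noun entities from one sentence.
word_vecs = np.array([
    [0.1, 0.9, 0.1],  # closest to node 1
    [0.0, 0.1, 0.9],  # closest to node 2
])

anchors, subgraph_nodes = best_matching_subgraph(word_vecs, node_vecs, adjacency)
print(anchors, subgraph_nodes)
```

In this toy graph, node 4 is connected only to node 0, which is a neighbor rather than a best match, so it falls outside the distance-1 cutoff and is left out of the subgraph.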

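Keywords: Named Entity Recognition (NER); multi-modal; domain; Knowledge Graph (KG); cross-modal feature fusion; attention mechanism

Li Huayu (李华昱), Zhang Zhikang (张智康), Yan Yang (闫阳), Yue Yang (岳阳)

College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong, China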

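Funding: Shandong Provincial Natural Science Foundation General Program (ZR2020MF140); Graduate Innovation Fund of China University of Petroleum (East China) (22CX04035A)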

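Computer Engineering (计算机工程)
East China Institute of Computing Technology (华东计算技术研究所); Shanghai Computer Society (上海市计算机学会)
Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.581
ISSN: 1000-3428
Year, volume (issue): 2024, 50(8)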