Computer Engineering & Science, 2024, Vol. 46, Issue 5: 929-936. DOI: 10.3969/j.issn.1007-130X.2024.05.018

A context-aware feature representation method in fine-grained entity typing

Liu Pan (刘盼), Guo Yanming (郭延明), Lei Jun (雷军), Wang Haoran (王昊冉), Lao Songyang (老松杨), Li Guohui (李国辉)

Author information

  • 1. College of Systems Engineering, National University of Defense Technology, Changsha, Hunan 410073, China

Abstract

Fine-grained entity typing assigns fine-grained types to entities in text. The type information gives entities rich semantics, and the task plays an important role in downstream tasks such as relation extraction, entity linking, and question answering. Because the length and position of an entity in a sentence are not uniform, the representation of an entity within its context cannot be computed directly. Existing fine-grained entity typing methods therefore process entity mentions and their contexts separately and represent them as separate features, which severs the semantic association between an entity and its context. This paper proposes a context-aware feature representation method for entity typing that places the entity back into its context and solves the problem of computing entity feature representations when entity length and position are not uniform. Experimental results show that extracting the feature representation of entities in their contexts with this method significantly improves fine-grained entity typing performance: on the Chinese fine-grained entity typing dataset CFET, Macro-F1 is generally more than 10% higher than the results reported in the original paper.
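The abstract does not say how the entity representation is computed, so the following is only a minimal sketch of one common way to obtain a fixed-size entity vector from contextual token vectors when the entity span varies in length and position: averaging the token vectors inside the span. The function name `span_mean_pool`, the half-open `(start, end)` span convention, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Vector = List[float]


def span_mean_pool(token_vectors: List[Vector], span: Tuple[int, int]) -> Vector:
    """Average the contextual vectors of the tokens in the half-open span [start, end).

    Hypothetical helper: the output size depends only on the vector dimension,
    not on where the entity sits in the sentence or how many tokens it covers.
    """
    start, end = span
    if not (0 <= start < end <= len(token_vectors)):
        raise ValueError("span must be a non-empty range inside the sentence")
    width = end - start
    dim = len(token_vectors[0])
    return [
        sum(token_vectors[i][d] for i in range(start, end)) / width
        for d in range(dim)
    ]


# Toy "contextual" token vectors for two sentences of different lengths
# (in practice these would come from a sentence encoder run over the full context).
sentence_a = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [0.0, 0.0]]
sentence_b = [[2.0, 2.0], [0.0, 6.0]]

entity_a = span_mean_pool(sentence_a, (1, 3))  # two-token entity in the middle
entity_b = span_mean_pool(sentence_b, (0, 1))  # one-token entity at the start
```

Both `entity_a` and `entity_b` have the same dimension even though the spans differ in length and position, so they could be fed to the same classifier. Mean pooling is only one option; other pooling or attention schemes would fit the same interface.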

Key words

fine-grained entity typing/context/feature representation

Funding

National Key Research and Development Program of China (2020AAA0108800)

Publication year

2024
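Computer Engineering & Science
College of Computer Science, National University of Defense Technology

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Number of references: 17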