Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2024, Vol. 37, Issue 8: 715-728. DOI: 10.16451/j.cnki.issn1003-6059.202408005

故事启发大语言模型的时序知识图谱预测

Narrative-Driven Large Language Model for Temporal Knowledge Graph Prediction

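Chen Juan (陈娟) 1, Zhao Xinchao (赵新潮) 2, Sui Jingyan (隋京言) 1, Qi Lin (祁麟) 3, Tian Chen (田辰) 3, Pang Liang (庞亮) 1, Fang Jinyun (方金云) 4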

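Author information

  • 1. Prospective Research Laboratory (前瞻研究实验室), Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049
  • 2. Lanting Center, Zhongke Big Data Research Institute (中科大数据研究院兰亭中心), Zhengzhou 450046
  • 3. School of Software Engineering, Beijing Jiaotong University, Beijing 100044
  • 4. Prospective Research Laboratory (前瞻研究实验室), Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190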

Abstract

Temporal knowledge graphs (TKGs) are massive yet sparse. The long-tail distribution of entities leads to poor generalization when reasoning about out-of-distribution entities, and the low frequency of historical interactions biases predictions of future events. To address these issues, a narrative-driven large language model method for TKG prediction is proposed. It leverages the world knowledge and complex semantic reasoning capabilities of large language models to improve the understanding of out-of-distribution entities and the association of events with sparse interactions. Firstly, a "key event tree" is selected according to the temporal and structural characteristics of the TKG: a historical event filtering strategy extracts the most representative events and summarizes the historical information relevant to the current query, reducing the input size while retaining the most important information. Then, a large language model generator is fine-tuned to turn the key event tree into a narrative story that is temporally and semantically connected and logically coherent, and this story serves as unstructured input. During generation, special attention is paid to the causal relationships and temporal order of events to ensure that the generated stories are coherent and plausible. Finally, a large language model reasoner infers the missing temporal entities. Experiments on three public datasets show that the proposed method makes full use of the capabilities of large models to achieve accurate temporal entity reasoning.
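The abstract gives the pipeline only in outline, so the following is a rough sketch of its first idea: selecting query-relevant history and handing it to a language model as text. It is not the paper's key event tree algorithm or its fine-tuned generator. The `Event` type, the recency-based filter in `select_history`, and the sentence template in `build_prompt` are all hypothetical stand-ins chosen for illustration.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """A TKG fact as a (subject, relation, object, time) quadruple."""
    subject: str
    relation: str
    obj: str
    time: int  # timestamp index (assumed to be an integer step)


def select_history(events, query_subject, query_time, max_events=3):
    """Keep the most recent past events that involve the query subject.

    Assumption: recency plus entity overlap stands in for the paper's
    historical event filtering strategy, which the abstract does not detail.
    """
    related = [
        e for e in events
        if e.time < query_time and query_subject in (e.subject, e.obj)
    ]
    related.sort(key=lambda e: e.time)
    return related[-max_events:]


def build_prompt(history, query_subject, query_relation, query_time):
    """Render the selected events as plain sentences, then append the query.

    Assumption: a fixed template replaces the fine-tuned story generator.
    """
    lines = [f"At time {e.time}, {e.subject} {e.relation} {e.obj}." for e in history]
    lines.append(f"At time {query_time}, {query_subject} {query_relation} whom?")
    return "\n".join(lines)


# Toy data with placeholder entity names.
events = [
    Event("A", "consults", "B", 1),
    Event("C", "visits", "D", 2),
    Event("B", "criticizes", "A", 3),
    Event("A", "meets", "C", 4),
    Event("A", "signs agreement with", "B", 5),
]
history = select_history(events, "A", query_time=6, max_events=3)
prompt = build_prompt(history, "A", "negotiates with", 6)
print(prompt)
```

The resulting text would be passed to a language model acting as the reasoner. Capping the history at `max_events` mirrors the stated goal of reducing input size while keeping the most relevant information.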

Key words

Temporal Knowledge Graph (TKG) / Large Language Model / Key Event Tree / Temporal Story / Event Inference

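Publication year: 2024
Journal: Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
Sponsors: Chinese Association of Automation; National Research Center for Intelligent Computing Systems; Hefei Institute of Intelligent Machines, Chinese Academy of Sciences
Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)
Impact factor: 0.954
ISSN: 1003-6059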