
Similarity Computation of Patent Phrases Based on Knowledge Injection Prompt Learning

A patent is a legal right granted to inventors to protect their inventions for a limited period, and patents play an important role in present-day social activities. However, existing research has not been adapted or optimized for patent similarity data, so it performs poorly on the patent phrase similarity matching task. Previous studies have shown that, in low-resource scenarios, prompt learning takes text fragments (templates) as input and converts a classification problem into a masked language modeling problem; a key step is constructing a projection between the label space and the label word space. This study proposes a knowledge-injection-based prompt learning method and applies it to patent phrase similarity matching. To address the insufficient information carried by patent phrases, the method exploits the similarity label information of patent phrases and uses external knowledge to enhance both the phrases and the label information. First, entity linking establishes associations between patent phrases and external knowledge. Next, a neighborhood information filtering mechanism based on entity influence is designed to alleviate the problem of insufficient patent phrase information. Finally, considering how different kinds of external knowledge affect the similarity computation, several enhanced prompt texts for patent phrases are designed. Experimental results show that the Pearson correlation coefficient (PCC) and Spearman rank correlation coefficient (SRC) of the proposed method improve by 6.8% and 5.7%, respectively, over the second-best baseline.
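The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the evaluation code, so the function names (`pearson`, `average_ranks`, `spearman`) and the toy score lists below are assumptions for illustration only, not the authors' implementation or data. PCC measures linear agreement between predicted and gold similarity scores; SRC is PCC computed on ranks, with tied values given their average rank.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gold and predicted similarity scores (made up for illustration).
gold = [0.0, 0.25, 0.5, 0.75, 1.0]
pred = [0.1, 0.2, 0.6, 0.7, 0.9]
pcc = pearson(gold, pred)
src = spearman(gold, pred)
```

Because SRC depends only on the ordering of the scores, a model can reach a perfect SRC while its PCC stays below 1 if the predictions are monotone but not linear in the gold scores, which is one reason both metrics are commonly reported together.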

patent phrase; similarity computation; knowledge injection; prompt learning; prompt text

Deng Yuanfei (邓远飞), Li Jiawei (李加伟), Jiang Yuncheng (蒋运承)


School of Computer Science, South China Normal University, Guangzhou, Guangdong 510631, China

School of Artificial Intelligence, South China Normal University, Foshan, Guangdong 528225, China


National Natural Science Foundation of China (61772210, U1911201)

2024

Computer Engineering
East China Institute of Computing Technology; Shanghai Computer Society


CSTPCD; Peking University Core Journals
Impact factor: 0.581
ISSN:1000-3428
Year, volume (issue): 2024, 50(4)