Research on Similar Patent Identification Based on Multimodal Feature Fusion (基于多模态特征融合的相似专利识别方法研究)

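Abstract (translated from Chinese): [Purpose/Significance] The rising number of patents poses a major challenge for patent retrieval, and how to use advanced computing techniques to identify similar patents has become a pressing problem. [Method/Process] This paper proposes a similar patent identification method based on multimodal feature fusion. The BERT-wwm model and the ResNet-50 model extract text-modality and image-modality features from patents; self-attention and cross-attention mechanisms are combined to make effective use of the feature information within each modality and the interaction information between modalities; on this basis, the model is trained and optimized to identify similar patents. [Result/Conclusion] In an empirical study using data from the IPC class "C08F10/00", the proposed model reaches an accuracy of 80.03% and a recall of 82.01%, outperforming the baseline models. In a simulated similar patent identification experiment, the model reaches a recall of 88.89%, indicating good practical performance. Combining text-modality and image-modality features can effectively improve the accuracy and efficiency of similar patent identification. The proposed method helps improve patent retrieval efficiency, speed up patent examination, support patent early-warning analysis, and strengthen intellectual property protection.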
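The fusion step described in the abstract (self-attention within each modality, then cross-attention between modalities) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors stand in for BERT-wwm and ResNet-50 outputs, the learned query/key/value projections are omitted, both modalities are assumed to share one feature dimension, and the names `attend`, `mean_pool`, and `fuse` are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention without learned projections (assumption)."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Each output row is a weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def mean_pool(rows):
    """Average a sequence of feature vectors into one vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fuse(text_feats, image_feats):
    """Hypothetical fusion: self-attention per modality, then cross-attention."""
    # Intra-modal information: each modality attends to itself.
    text_self = attend(text_feats, text_feats, text_feats)
    image_self = attend(image_feats, image_feats, image_feats)
    # Inter-modal interaction: text queries image features, and vice versa.
    text_cross = attend(text_self, image_self, image_self)
    image_cross = attend(image_self, text_self, text_self)
    # Concatenate the pooled vectors into one fused representation.
    return mean_pool(text_cross) + mean_pool(image_cross)

# Toy stand-ins for token features (text) and region features (image).
text_feats = [[1.0, 0.0, 0.5, 0.2], [0.3, 0.8, 0.1, 0.0], [0.0, 0.2, 0.9, 0.4]]
image_feats = [[0.5, 0.5, 0.0, 1.0], [0.1, 0.9, 0.3, 0.2]]
fused = fuse(text_feats, image_feats)
print(len(fused))
```

In a trained model the fused vector would feed a classifier that decides whether two patents are similar; that stage is not specified in the abstract and is left out here.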

English title: Research on Similar Patent Identification Based on Multimodal Feature Fusion

English abstract: [Purpose/Significance] The burgeoning number of patents poses significant challenges to patent retrieval, highlighting the urgent need for advanced computational techniques to identify similar patents. [Method/Process] This paper proposes a multimodal feature fusion method for similar patent identification. It uses the BERT-wwm model and the ResNet-50 model to extract textual and image features of patents, respectively. By integrating self-attention and cross-attention mechanisms, the method harnesses intra-modal feature information and inter-modal interaction information. On this basis, the model is trained and optimized for similar patent identification. [Result/Conclusion] Empirical tests using data from IPC category "C08F10/00" show that the model achieves an accuracy of 80.03% and a recall of 82.01%, outperforming baseline models. In simulations of similar patent identification, the model reaches a recall of 88.89%, indicating strong practical performance. The fusion of textual and image modal features enhances the accuracy and efficiency of similar patent identification. This approach improves patent retrieval efficiency, accelerates the patent examination process, aids patent early-warning analysis, and strengthens intellectual property protection.
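The abstract reports accuracy and recall but does not say how they were computed. The sketch below assumes the standard binary-classification definitions (1 = similar, 0 = not similar); the function name `accuracy_and_recall` and the toy labels are hypothetical and unrelated to the paper's data.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) for binary labels, assuming 1 = similar pair."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # similar, found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # dissimilar, rejected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # similar, missed
    accuracy = (tp + tn) / len(pairs)
    # Recall: share of truly similar patents that the model retrieved.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall

# Toy labels for eight candidate patent pairs (illustrative only).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(accuracy, recall)  # 0.75 0.8
```

Recall matters most in a retrieval setting such as patent examination, because a missed similar patent costs more than an extra candidate to review.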

English keywords:

multimodal; textual information; image information; self-attention; cross-attention; feature fusion; similar patent identification

Authors:

Xie Xiaodong (谢小东), Wu Jie (吴洁), Sheng Yongxiang (盛永祥), Wang Jiangang (王建刚), Zhou Xiao (周潇)

Affiliation:

School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212003

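Keywords (translated from Chinese):

multimodal; text features; image features; self-attention; cross-attention; feature fusion; similar patent identification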

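Funding:

General Program of the National Natural Science Foundation of China; Postgraduate Research & Practice Innovation Program of Jiangsu Province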

Project numbers:

72171122; KYCX23_3817

Publication year:

2024

DOI:

10.13266/j.issn.0252-3116.2024.18.011

Library and Information Service (图书情报工作)

National Science Library, Chinese Academy of Sciences (中国科学院文献情报中心)

CSTPCD; CSSCI; CHSSCD; Peking University Core Journals (北大核心)

Impact factor: 2.203

ISSN: 0252-3116

Year, volume (issue): 2024, 68(18)