Library and Information Service (图书情报工作), 2024, Vol. 68, Issue 18: 112-122. DOI: 10.13266/j.issn.0252-3116.2024.18.011

Research on Similar Patent Identification Based on Multimodal Feature Fusion

Xie Xiaodong (谢小东), Wu Jie (吴洁), Sheng Yongxiang (盛永祥), Wang Jiangang (王建刚), Zhou Xiao (周潇)

Author information

  • 1. School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212003

Abstract

[Purpose/Significance] The rapidly growing number of patents poses significant challenges for patent retrieval, and how to use advanced computational techniques to identify similar patents has become an urgent problem. [Method/Process] This paper proposes a similar patent identification method based on multimodal feature fusion. It uses the BERT-wwm model and the ResNet-50 model to extract the textual and image modal features of patents, respectively. By combining self-attention and cross-attention mechanisms, the method exploits both the feature information within each modality and the interaction information between modalities. On this basis, the model is trained and optimized to identify similar patents. [Result/Conclusion] In an empirical study on data from IPC category "C08F10/00", the model achieves an accuracy of 80.03% and a recall of 82.01%, outperforming the baseline models. In a simulated similar patent identification experiment, the model reaches a recall of 88.89%, indicating good performance in practical application. Combining textual and image modal features effectively improves the accuracy and efficiency of similar patent identification. The method helps improve patent retrieval efficiency, accelerate patent examination, support patent early-warning analysis, and strengthen intellectual property protection.
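
The fusion step described in the abstract (self-attention within each modality, followed by cross-attention between modalities) can be illustrated with a minimal sketch. This is not the authors' implementation: the random arrays stand in for BERT-wwm token features and ResNet-50 image features that are assumed to have already been projected to a shared width, the learned query/key/value projections are omitted, and the names `attention` and `fuse`, the mean-pooling, and the concatenation are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (output, attention weights)."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)  # shape: (n_queries, n_keys)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ values, weights


def fuse(text_feats, image_feats):
    """Hypothetical fusion: self-attention within each modality, then
    cross-attention in both directions, then mean-pool and concatenate."""
    text_self, _ = attention(text_feats, text_feats, text_feats)
    image_self, _ = attention(image_feats, image_feats, image_feats)
    # Text positions query the image features, and image positions query the text.
    text_cross, _ = attention(text_self, image_self, image_self)
    image_cross, _ = attention(image_self, text_self, text_self)
    return np.concatenate([text_cross.mean(axis=0), image_cross.mean(axis=0)])


rng = np.random.default_rng(0)
d_model = 16                                  # assumed shared feature width
text_feats = rng.normal(size=(12, d_model))   # stand-in for 12 token vectors
image_feats = rng.normal(size=(49, d_model))  # stand-in for a 7x7 grid of image features
fused = fuse(text_feats, image_feats)
print(fused.shape)  # (32,)
```

In a full model, a fused vector of this kind would presumably feed a trainable classification or similarity layer; that stage is left out because the abstract does not specify it.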

Key words

multimodal/textual information/image information/self-attention/cross-attention/feature fusion/similar patent identification

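Funding

National Natural Science Foundation of China, General Program (72171122)

Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX23_3817)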

Publication year

2024
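Journal: Library and Information Service (图书情报工作)
Publisher: National Science Library, Chinese Academy of Sciences
Indexed in: CSTPCD, CSSCI, CHSSCD, PKU Core (北大核心)
Impact factor: 2.203
ISSN: 0252-3116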