Research on Patent Text Classification Based on Graph Neural Network

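Abstract (from the Chinese): Traditionally, patents are classified by experts who review each application individually. With the rapid development of big data, artificial intelligence, and natural language processing, automatic classification of patent text is becoming an important research direction in both academia and industry. Text classification can be used to judge whether a patent application will be granted, helping examiners process and analyze application documents automatically and thereby improving efficiency. Targeting the English text of a large volume of patents, this paper proposes an automatic patent text classification method based on a graph neural network model to assess whether a patent application can be granted. The deep learning algorithm TextGCN is trained on a corpus of patent abstracts; it uses the neighbor information and node features of graph-structured data and produces representation vectors for patent texts through a neural network, which are then used to predict whether a patent is granted. Experimental results show that the deep learning algorithm used in this paper achieves good classification performance and that, compared with the Doc2vec and TFIDF representation methods, the model improves precision, recall, accuracy, and F1, providing a reliable research basis for automatic prediction of patent grant outcomes.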
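The abstract does not spell out how neighbor information and node features are combined, so the following is only a minimal sketch of the standard two-layer graph convolution that TextGCN-style models are built on, not the authors' implementation. The toy graph (two document nodes, three word nodes), the unweighted edges, the random weight matrices `w0` and `w1`, and all function names are assumptions made for illustration; a real TextGCN graph would carry TF-IDF and co-occurrence edge weights and learned parameters.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric normalization used by GCNs."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degree of every node (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(x):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(adj, features, w0, w1):
    """Two-layer GCN: softmax(A_norm @ ReLU(A_norm @ X @ W0) @ W1)."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w0, 0.0)  # aggregate neighbors, then ReLU
    return softmax(a_norm @ hidden @ w1)              # per-node class probabilities

# Hypothetical toy graph: nodes 0-1 are documents, nodes 2-4 are words.
adj = np.array([
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 1, 0, 1, 0],
], dtype=float)

features = np.eye(5)                        # one-hot node features
rng = np.random.default_rng(0)
w0 = rng.normal(scale=0.1, size=(5, 4))     # untrained, illustrative weights
w1 = rng.normal(scale=0.1, size=(4, 2))     # two classes: granted / not granted

probs = gcn_forward(adj, features, w0, w1)
doc_probs = probs[:2]                       # rows for the two document nodes
```

In this sketch each document's output row depends on the words it is linked to and, through the second layer, on other documents that share those words, which is the sense in which the representation uses neighbor information.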

English title: Research on Patent Text Classification Based on Graph Neural Network

English abstract: Traditional patent classification is carried out manually by experts. With the development of big data, artificial intelligence, and natural language processing technology, automatic classification of patents is becoming one of the important research directions in both academia and industry. Text classification technology can be applied to determine whether a patent application can be granted, aiding in the automation of processing and analyzing a large number of patent documents, thereby improving work efficiency. This paper focuses on the English texts from a vast number of patents and proposes an automatic patent text classification method based on the graph neural network model, which is used to assess whether patent applications can obtain authorization. This article utilizes the deep learning algorithm TextGCN to learn on patents' abstracts, leveraging the neighbor information and node features of graph-structured data. Through the neural network, it generates representation vectors for patents, facilitating the forecasting of patent authorization results. The experimental findings demonstrate that the deep learning approach applied in this study achieves commendable classification results. Compared to the Doc2vec and TFIDF methods, the TextGCN model shows improvements in terms of precision, recall, accuracy, and F1 score. This method can offer a dependable research foundation for automatic prediction of the grant status of patents.
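For readers unfamiliar with the four reported metrics, the sketch below shows how precision, recall, accuracy, and F1 are computed for a binary granted/not-granted task. The function name `binary_metrics` and the label lists are hypothetical toy values chosen for illustration; they are not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, accuracy, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against empty denominators instead of raising ZeroDivisionError.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

# Toy labels (assumed): 1 = granted, 0 = not granted.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are three true positives, one false positive, one false negative, and three true negatives, so all four metrics come out to 0.75.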

English keywords:

patent classification; graph convolutional network; Doc2vec; TFIDF; representation learning

Authors:

魏雯婕、张更平

Affiliation:

Tongji University Library, Shanghai 200092

Keywords:

patent classification; graph convolutional neural network; Doc2vec; TFIDF; representation learning

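Funding:

Shanghai Society for Scientific and Technical Information, Special Project on Strategic Emerging Industries Intelligence (2023)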

Publication year:

2024

Competitive Intelligence (竞争情报)

CSTPCD

Year, volume (issue): 2024, 20(2)

Number of references: 32