Research on Patent Retrieval Strategy Based on ARSER Deep Matching Model
Patent matching is a crucial component of patent retrieval: its goal is to find, rapidly and accurately, other patent documents similar to a queried patent. Patent retrieval runs through every stage of the patent life cycle and underlies nearly all patent analysis tasks. Previous patent retrieval research has matched patents primarily on short texts such as titles and abstracts, or has applied methods such as TF-IDF and word2vec to patent specifications for shallow semantic text matching. To fully leverage the deep semantic information contained in patent specifications, this paper proposes the ARSER model for deep semantic text matching. In the ARSER model, patent specifications are first segmented into sentences. These sentences are then represented as vectors using BERT, with zero vectors used as padding to generate consistent vector representations across different patent specifications. An attention mechanism is then applied to obtain, for each sentence vector of one patent specification, its best-matching vector within the other specification. The matches between these vector pairs are computed as local matches, which are subsequently fused to obtain the final matching result. The effectiveness of the ARSER model in identifying prior art documents is evaluated on the dataset provided by the Conference and Labs of the Evaluation Forum-Intellectual Property (CLEF-IP) and compared with methods commonly used in previous research on improving patent retrieval strategies: traditional methods such as TF-IDF and LDA, and neural network language models such as word2vec. The experimental results show that the proposed ARSER model outperforms the benchmark methods in retrieval performance on patent matching tasks. Specifically, compared with Doc2Vec, the best-performing baseline, ARSER achieves a 3.97% improvement in recall.
patent matching; patent retrieval strategies; patent specification; retrieval performance; ARSER model; BERT model
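
The matching pipeline outlined in the abstract (zero padding, attention-based alignment, local matching, fusion) can be illustrated with a minimal NumPy sketch. This is not the ARSER implementation. Several parts are assumptions made only for illustration: the sentence vectors are small hand-written arrays standing in for BERT embeddings, the attention is a plain dot-product softmax, the local match is cosine similarity, the fusion is a mean over real (non-padding) sentences, and the function names `pad_sentences`, `soft_align`, and `match_score` are hypothetical.

```python
import numpy as np


def pad_sentences(vectors, max_len):
    """Zero-pad (or truncate) an (n_sentences, dim) matrix to (max_len, dim).

    Returns the padded matrix and a boolean mask marking real sentences.
    """
    n, dim = vectors.shape
    padded = np.zeros((max_len, dim), dtype=float)
    kept = min(n, max_len)
    padded[:kept] = vectors[:kept]
    mask = np.arange(max_len) < kept
    return padded, mask


def soft_align(query, doc, doc_mask):
    """For each query sentence, build its attention-weighted match in doc.

    Assumption: dot-product attention with a softmax over doc sentences;
    padding rows of doc are masked out so they receive ~zero weight.
    """
    scores = query @ doc.T                               # (Lq, Ld)
    scores = np.where(doc_mask[None, :], scores, -1e9)   # hide padding
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ doc                                 # (Lq, dim)


def match_score(a, b, max_len=4):
    """Fuse per-sentence local matches of patent a against patent b.

    Assumptions: local match = cosine similarity between a sentence and its
    aligned vector; fusion = mean over the real sentences of a.
    """
    a_pad, a_mask = pad_sentences(a, max_len)
    b_pad, b_mask = pad_sentences(b, max_len)
    aligned = soft_align(a_pad, b_pad, b_mask)
    num = (a_pad * aligned).sum(axis=1)
    den = np.linalg.norm(a_pad, axis=1) * np.linalg.norm(aligned, axis=1)
    local = np.where(a_mask, num / np.maximum(den, 1e-12), 0.0)
    return float(local.sum() / max(int(a_mask.sum()), 1))


if __name__ == "__main__":
    # Toy 2-dimensional "sentence embeddings" (placeholders, not BERT output).
    patent_a = np.array([[1.0, 0.0], [0.0, 1.0]])
    patent_b = np.array([[1.0, 0.0]])
    print(match_score(patent_a, patent_b))
```

Note that the score in this sketch is directional: it measures how well the sentences of the first patent are covered by the second, so a symmetric measure would need to combine both directions.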