As an important type of technical document, the patent is of substantial significance to China's national intellectual property strategy. Existing patent corpora are mostly built for information retrieval and machine translation tasks, leaving fine-grained patent annotation largely unexplored. To facilitate the forthcoming development of intelligent patent technology, this paper constructs a Patent Key Information Corpus consisting of 313 patents annotated with the issues, methods, and effects described in their texts. State-of-the-art (SOTA) named entity recognition models are then applied to the corpus, and the sharp decrease in their performance indicates that automatically identifying the key information in a patent is a challenging information extraction (IE) task.
Patent; corpus; key information
Zhang Wenting, Zhao Meihan, Ma Yixuan, Wang Wenrui, Liu Yuzhe, Yang Muyun
Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001
Harbin Yangguang Huiyuan Intellectual Property Agency Co., Ltd. (哈尔滨市阳光惠远知识产权代理有限公司), Harbin, Heilongjiang 150000
Chinese National Conference on Computational Linguistics
Nanchang (CN)
The 21st Chinese National Conference on Computational Linguistics