Research on Standard Literature Classification Based on Large Language Models

Liu Chunhui (刘春卉)¹, Gao Zhichun (高志春)², Zhang Hui (张辉)³, Huang Zhenyuan (黄振远)³

Author information

1. China National Institute of Standardization
2. Shanxi Provincial Administration for Market Regulation
3. Beihang University

Abstract

In the era of big data, the explosive growth of standards and related literature poses significant challenges for efficient document management and services. As industries continue to evolve and diversify, traditional standard classification systems cannot adapt flexibly to changing industrial needs, and the gap between standard classification and actual industry keeps widening. The problem is especially pronounced in the information age, and traditional standard classification is difficult to transform and upgrade. Resolving the mismatch between standard classification and industry has therefore become an important part of improving the efficiency and quality of document management and services. Against this background, this paper proposes a new method that aims to bridge the gap between standard classification and industry and to improve the accuracy of industrial classification, so as to better meet evolving industrial needs. The method also addresses the complex problems faced in Chinese industrial classification, namely multiple semantics, multiple categories, and scarce annotated data.

Key words

large language models / semantic representation / literature / standards

Publication year

2024

Standard Science (标准科学)

China National Institute of Standardization; China Association for Standardization

Impact factor: 0.32

ISSN: 1674-5698
