Research on Intelligent Generation of Structured Review for Retrieval Result Set

[Purpose/Significance] In academic literature retrieval and reading, the volume of academic information now far exceeds users' capacity to process it, leading to information accumulation. To improve users' reading efficiency and knowledge absorption, this paper comprehensively mines and presents the content of academic literature retrieval result sets. [Method/Process] On the one hand, starting from the reading experience and the characteristics of literature retrieval scenarios, it designs a structured form of review presentation. On the other hand, to improve the technical method and content quality, it applies deep-learning-based automatic text generation: it constructs a scientific and technical literature dataset, trains and optimizes a text summarization model, and on that basis uses a large language model to generate structured review text. [Result/Conclusion] After training and optimization, the summarization model's recall and F1 score increase by 2.07% on average across the evaluation metrics. In practical evaluation, structured review generation based on the large language model effectively extracts, summarizes, and synthesizes the key points of the content. This verifies the feasibility of the technical route and its application, and offers practical guidance for improving knowledge-based services for academic literature, intelligent reading assistance, and the comprehensive mining and presentation of semantic content.
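The abstract reports gains in recall and F1 but does not name the underlying metrics. As a minimal sketch, assuming a ROUGE-1-style unigram overlap between a generated summary and a reference summary, the two quantities could be computed as follows. The function name `overlap_scores` and the whitespace tokenization are illustrative assumptions, not the authors' evaluation code; Chinese text would need a proper word segmenter first.

```python
from collections import Counter


def overlap_scores(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (ROUGE-1 style).

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it
    # appears in both texts.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = overlap_scores(
    "the model generates a summary",
    "the model generates a structured summary",
)
print(scores)
```

Recall measures how much of the reference the generated text covers, and F1 balances that against precision, which is why a summarization model is often judged on both.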

Keywords: literature search; structured overview; large language model; automatic text generation

Authors: Meng Xuyang (孟旭阳), Chen Yang (陈阳), Bai Haiyan (白海燕)

Institute of Scientific and Technical Information of China, Beijing 100038

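Funding: National Key R&D Program of China (2022YFF0711900); Innovation Research Fund Youth Project of the Institute of Scientific and Technical Information of China (QN2023-11)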

2024

Library and Information Service (图书情报工作)
National Science Library, Chinese Academy of Sciences

Indexed in: CSTPCD, CSSCI, CHSSCD, Peking University Core Journals
Impact factor: 2.203
ISSN: 0252-3116
Year, volume (issue): 2024, 68(6)