
A Survey of Text Generation and Evaluation Based on Intrinsic Quality Constraints

In recent years, text generation models represented by ChatGPT, which aim to adapt to complex scenarios and meet a wide range of human application needs, have become a focus of both academia and industry. However, the strength of large language models (LLMs) such as ChatGPT, namely their high fidelity to user intent, comes with some factual errors, and these models still rely on prompt content to control fine-grained generation quality and domain adaptability. Research on text generation methods centered on intrinsic quality constraints therefore remains important. Building on a comparative study of key content generation models and techniques from recent years, this paper defines the basic form of text generation with intrinsic quality constraints, together with six quality features based on "credibility, expressiveness and elegance" (信、达、雅). For these six quality features, it analyzes and summarizes generator model designs and the related algorithms. It also summarizes a range of automatic and human evaluation metrics and methods organized around the different intrinsic quality features. Finally, it discusses future research directions for intrinsic quality constraint techniques.

Keywords: natural language processing; language model; text generation; text quality; text evaluation

Lan Yuqian, Rao Yuan, Li Guancheng, Sun Ling, Xia Bingcan, Xin Tingting


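Laboratory of Social Intelligence and Complex Data Processing, School of Software Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049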

China Changfeng Mechanics and Electronics Technology Academy, Beijing 100854


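Funding: Key Program of the National Natural Science Foundation of China (U22B2036); Key Research and Development Program of the Ministry of Science and Technology (2019YFB2102300); Special Guidance Fund for Central Universities on Building World-Class Universities (Disciplines) and Characteristic Development (PY3A022)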

2024

Acta Electronica Sinica
Chinese Institute of Electronics


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.237
ISSN: 0372-2112
Year, volume (issue): 2024, 52(2)