Design and Implementation of a Task-Oriented Dialogue Model Based on Supervised Learning and Deep Reinforcement Learning

Li Yuheng (李昱珩)¹, Zhu Yanxia (朱彦霞)²

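Author information

1. School of Mathematical Sciences, East China Normal University, Shanghai 200241
2. Henan Provincial Staff Hospital (河南省职工医院), Zhengzhou, Henan 450000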

Abstract

[Purpose] This study explores the design of task-oriented dialogue models for intelligent conversational systems and proposes a task-oriented dialogue system framework based on supervised learning and reinforcement learning. [Methods] The approach combines supervised learning with reinforcement learning. First, replies generated by an open-domain dialogue model are embedded into the task-oriented reply process to build an integrated dialogue model. Second, supervised learning and transfer learning are used to build a dialogue policy model that guides the system's decision-making. Finally, deep reinforcement learning algorithms are applied for optimization and updating to improve the performance of the dialogue system. [Results] Experiments show that the task-oriented dialogue system model outperforms the baseline models on the BLEU, ROUGE, and F1 metrics. The model generates replies that are both accurate and diverse. [Conclusions] By combining supervised learning and reinforcement learning, the study designs an intelligent dialogue system framework built on a task-oriented dialogue model. The framework performs well on task-oriented dialogue and offers a useful exploration for the development of intelligent conversational systems.
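The abstract names F1 as one of the evaluation metrics but does not say how it is computed. As a rough illustration only, the sketch below assumes the common token-overlap definition of F1 between a generated reply and a reference reply, with whitespace tokenization; the function name `token_f1` and the example sentences are hypothetical and are not taken from the paper.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated reply and a reference reply.

    Assumption: whitespace tokenization and multiset overlap. The paper's
    actual F1 definition is not stated in the abstract.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, under these assumptions a reply "book a table" scored against the reference "book a table for two" has precision 1.0 and recall 0.6. A reply that omits part of the reference is therefore penalized through recall even when every generated token is correct.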

Keywords

task-oriented dialogue system; supervised learning; reinforcement learning

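Funding

Henan Province Soft Science Research Program (222400410151)

Henan Province Soft Science Research Program (222400410184)

Henan Province Soft Science Research Program (232400411123)

Henan Province Medical Science and Technology Research Program (LHGJ20220248)

Henan Province Key R&D and Guidance Special Program for Science and Technology Research (232102310491)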

Publication year

2024

Henan Science and Technology (河南科技)

Henan Institute of Science and Technology Information (河南省科学技术信息研究院)

Impact factor: 0.615

ISSN: 1003-5168

Number of references: 17