Design of a Pre-school Education Chatbot Based on Data-Driven Methods and Deep Reinforcement Learning

To make pre-school education chatbots better meet users' needs for personalized dialogue, a dialogue policy model based on user profiles (user portraits) and deep reinforcement learning is proposed. The model first uses a gated recurrent unit (GRU) to model the history of dialogue actions and extracts an action history vector that captures user behavior characteristics. The extracted action history vector is then combined with the user profile vector and the current dialogue state vector and fed into an action-value network. Finally, through the action-value network, the model finds the reply action that best matches the current user's personality and generates the corresponding dialogue policy. Experimental results show that the proposed model performs best when the dialogue action history window size k is 3. Compared with DQN, DRQN and Dueling, the deep reinforcement learning dialogue policy models commonly used in current dialogue systems, the proposed model reaches 0.45, 17.93 and 26.22 on dialogue success rate, average dialogue reward and average number of dialogue turns, respectively, a clear improvement with better dialogue quality and efficiency. The approach merits further study and wider application.
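The pipeline described in the abstract (GRU over the last k dialogue actions, concatenation with the user profile and dialogue state vectors, then an action-value network) can be sketched roughly as below. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the class names `GRUEncoder` and `ActionValueNetwork`, the helper `select_action`, all dimensions, the one-hot action encoding, the single hidden layer and the random untrained weights are hypothetical. Training (for example, a DQN-style update) is omitted, and GRU biases are left out for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUEncoder:
    """Minimal GRU (no biases) that folds a sequence of action vectors into one vector."""
    def __init__(self, input_dim, hidden_dim, rng):
        def init(rows, cols):
            return rng.normal(0.0, 0.1, size=(rows, cols))
        # Weights for the update gate (z), reset gate (r) and candidate state (h).
        self.Wz, self.Uz = init(hidden_dim, input_dim), init(hidden_dim, hidden_dim)
        self.Wr, self.Ur = init(hidden_dim, input_dim), init(hidden_dim, hidden_dim)
        self.Wh, self.Uh = init(hidden_dim, input_dim), init(hidden_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def encode(self, actions):
        h = np.zeros(self.hidden_dim)
        for x in actions:
            z = sigmoid(self.Wz @ x + self.Uz @ h)              # update gate
            r = sigmoid(self.Wr @ x + self.Ur @ h)              # reset gate
            h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h))  # candidate state
            h = (1.0 - z) * h + z * h_tilde                     # blend old and new state
        return h

class ActionValueNetwork:
    """One-hidden-layer network mapping a feature vector to one Q-value per action."""
    def __init__(self, input_dim, hidden_dim, num_actions, rng):
        self.W1 = rng.normal(0.0, 0.1, size=(hidden_dim, input_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(0.0, 0.1, size=(num_actions, hidden_dim))
        self.b2 = np.zeros(num_actions)

    def q_values(self, features):
        hidden = np.maximum(0.0, self.W1 @ features + self.b1)  # ReLU
        return self.W2 @ hidden + self.b2

def select_action(encoder, q_net, action_history, user_profile, dialogue_state, k=3):
    """Greedy action choice from the last k actions, the profile and the state."""
    history_vec = encoder.encode(action_history[-k:])
    features = np.concatenate([history_vec, user_profile, dialogue_state])
    q = q_net.q_values(features)
    return int(np.argmax(q)), q

# Hypothetical sizes, chosen only for illustration.
NUM_ACTIONS, PROFILE_DIM, STATE_DIM, HISTORY_DIM = 6, 4, 8, 5
rng = np.random.default_rng(0)
encoder = GRUEncoder(NUM_ACTIONS, HISTORY_DIM, rng)
q_net = ActionValueNetwork(HISTORY_DIM + PROFILE_DIM + STATE_DIM, 16, NUM_ACTIONS, rng)

history = [np.eye(NUM_ACTIONS)[a] for a in (2, 0, 5, 1)]  # one-hot past actions
profile = rng.random(PROFILE_DIM)
state = rng.random(STATE_DIM)
best_action, q = select_action(encoder, q_net, history, profile, state, k=3)
```

With k=3, only the three most recent actions influence the history vector, which mirrors the window size the abstract reports as working best; the weights here are untrained, so the selected action itself carries no meaning.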

Keywords: deep reinforcement learning; user portrait; preschool education; dialogue interaction; neural network

Tan Linxia (谭琳霞), Li Ahong (李阿红)


Xianyang Vocational Technical College (咸阳职业技术学院), Xianyang, Shaanxi 712000, China


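Funding: Xianyang Vocational Technical College 2021 Scientific Research Innovation Team project, Preschool Children's Core Experience Development Research Innovation Team (No. CX202103)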

2024

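Automation and Instrumentation (自动化与仪器仪表)
Chongqing Industrial Automation Instrumentation Research Institute; Chongqing Society of Automation and Instrumentation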


CSTPCD
Impact factor: 0.327
ISSN: 1001-9227
Year, volume (issue): 2024, (6)