Design of a Voice Interaction Service System Based on an Improved DFSMN Model

Wang Xiaodan (王晓丹)¹, Xie Xianming (谢先明)², Li Huo (李活)²

Author information

1. Hunan College of Foreign Studies (湖南外国语职业学院), Changsha 410211
2. Hunan Communication Polytechnic (湖南交通职业技术学院), Changsha 410132

Abstract

To further improve the quality of automated voice interaction services provided by service robots, an improved DFSMN-CTC model is proposed to strengthen the recognition capability of voice interaction systems. The structure of the memory module in the conventional DFSMN model and the connections between its memory units are modified, and the resulting model is combined with CTC to perform Japanese speech recognition. Experimental results show that, compared with speech recognition models built with other modeling criteria and with the DFSMN model before improvement, the improved DFSMN-CTC model achieves better voice interaction performance, reducing the word error rate by 6.42% and 6.17%, respectively. Compared with other speech recognition models, the improved DFSMN-CTC model maintains the lowest average character error rate under all experimental conditions, indicating high recognition accuracy. In summary, a Japanese voice interaction system built on the improved DFSMN-CTC model delivers effective Japanese voice interaction and has practical value.
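The abstract reports its results as word-level and character-level error rates without defining them. As a minimal sketch, assuming the standard edit-distance definition (the paper's exact scoring procedure is not stated here, and the function names and sample sentences below are purely illustrative), the metric could be computed as follows:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Word error rate: tokens are whitespace-separated words (illustrative data).
wer = error_rate("turn on the light".split(), "turn off the lights".split())

# Character error rate: tokens are single characters, which suits
# unsegmented Japanese text (illustrative data).
cer = error_rate(list("電気をつけて"), list("電気をけして"))

print(f"WER = {wer:.3f}, CER = {cer:.3f}")
```

Both metrics use the same function; only the tokenisation changes. That is why results for Japanese, which has no whitespace between words, are often reported at the character level.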

Keywords

service robots / voice interaction / DFSMN model / CTC

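Funding

Special Project on Vocational Education English Curriculum Standards and Foreign Language Major Teaching Standards (2022), Teaching Steering Committee for Foreign Language Majors in Vocational Colleges, Ministry of Education (WYJZW-2022-19-0229)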

Publication year

2024

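Automation & Instrumentation (自动化与仪器仪表)

Chongqing Industrial Automation Instrumentation Research Institute; Chongqing Society of Automation and Instrumentation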

CSTPCD

Impact factor: 0.327

ISSN: 1001-9227

Number of references: 14
