Acta Electronica Sinica (电子学报), 2024, Vol. 52, Issue 4: 1282-1287. DOI: 10.12263/DZXB.20230905

An In-Vehicle Interaction Speech Enhancement and Recognition Method Based on Lightweight Models in Complex Noise Environments

Lian Xiaoyu (廉筱峪)¹, Xia Nan (夏楠)¹, Dai Gaole (戴高乐)¹, Yang Hongqin (杨红琴)¹

Author information

  • 1. School of Information Science and Engineering, Dalian Polytechnic University, Dalian, Liaoning 116034

Abstract

In-vehicle voice interaction suffers from low recognition rates in complex noise environments and is difficult to deploy on devices with limited computing resources. To address both problems, this paper designs a lightweight speech enhancement model and a lightweight speech recognition model and trains them jointly. The speech enhancement model introduces a multi-scale channel time-frequency attention module to extract multi-scale time-frequency features and the key information along each dimension. The speech recognition model uses a proposed multi-head element-wise linear attention, which significantly reduces the computational complexity of the attention module. Experiments on a self-built dataset show that the jointly trained model has good noise robustness.

Key words

deep learning/speech enhancement/speech recognition/attention mechanism/joint training

Funding

Industry-University Cooperative Education Project of the Ministry of Education (220603231024713)

Publication year

2024
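Acta Electronica Sinica (电子学报)
Publisher: Chinese Institute of Electronics (中国电子学会)
Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 1.237
ISSN: 0372-2112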