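An In-Vehicle Interaction Speech Enhancement and Recognition Method Based on Lightweight Models in Complex Noise Environments (复杂噪声环境下基于轻量化模型的车内交互语音增强和识别方法)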

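Abstract (translated from the Chinese): To address the low recognition rate of in-vehicle voice interaction in complex noise environments and the difficulty of deployment on devices with limited computing resources, this paper designs a lightweight speech enhancement model and a lightweight speech recognition model and trains them jointly. The speech enhancement model introduces a multi-scale channel time-frequency attention module to extract multi-scale time-frequency features and the key information along each dimension. For the speech recognition model, a multi-head element-wise linear attention is proposed, which significantly reduces the computational complexity of the attention module. Experiments show that the jointly trained model exhibits good noise robustness on a self-built dataset.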

English title: An In-Vehicle Interaction Speech Enhancement and Recognition Method Based on Lightweight Models in Complex Environment

English abstract: To address the low recognition rate of in-vehicle voice interaction in complex noise environments and the difficulty of deployment on devices with limited computing resources, this article proposes a lightweight and robust voice recognition method for noisy environments, based on a joint training framework. The speech enhancement model introduces a multi-scale channel time-frequency attention module to extract multi-scale time-frequency features and key information in various dimensions. In the speech recognition model, multi-head element-wise linear attention is proposed, which significantly reduces the computational complexity of the attention module. Experiments show that the jointly trained model exhibits good noise robustness on the self-made dataset.

English keywords:

deep learning; speech enhancement; speech recognition; attention mechanism; joint training

Authors:

廉筱峪、夏楠、戴高乐、杨红琴

Affiliation:

School of Information Science and Engineering, Dalian Polytechnic University, Dalian, Liaoning 116034, China

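Keywords (translated from the Chinese):

deep learning; speech enhancement; speech recognition; attention mechanism; joint training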

Funding:

Ministry of Education Industry-University Cooperative Education Project (教育部产学合作协同育人项目)

Project number:

220603231024713

Publication year:

2024

DOI:

10.12263/DZXB.20230905

Acta Electronica Sinica (电子学报)

Chinese Institute of Electronics (中国电子学会)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.237

ISSN: 0372-2112

Year, volume (issue): 2024, 52(4)