A lightweight speech enhancement method based on dual-branch complex spectrum with multiple feature aggregation

Many current improved variants of the convolutional recurrent network (CRN) use a single-masking or single-mapping encoder-decoder structure, and as a result they extract limited features, capture global features weakly, and carry large parameter counts. To address these problems, this paper proposes an efficient single-channel speech enhancement network that performs joint masking and mapping of the complex spectrum, combining a multi-feature aggregation convolution module with an efficient Transformer fused attention mechanism. In the encoder and decoder layers, a Dual-branch Gated Cooperative Unit (DGCU) extracts multi-level complex-spectrum features and then lets them interact and aggregate, compensating for single-feature extraction. In the intermediate layer, a channel time-frequency attention fusion module focuses on the time-frequency and spatial local detail features of speech. Ablation and comparison experiments on the THCHS30 dataset show that the network is lightweight, with the lowest parameter count and a relatively low computational cost. PESQ improves by 10.5%~50.6% under matched noise and by 16.3%~94.5% under mismatched noise. Both objective and subjective metrics are better than those of the other compared network models, indicating strong noise reduction performance and network generalization capability.
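As a rough illustration of the two estimation styles named in the abstract, the sketch below contrasts complex masking (the network output multiplies the noisy spectrum) with complex mapping (the network output is taken directly as the clean spectrum estimate) on a few time-frequency bins. It is a generic textbook-style example, not the paper's network: the function names, the averaging fusion rule, and all numeric values are assumptions made purely for illustration, and the abstract does not say how the two branches are actually combined. The last helper shows how a relative PESQ gain in percent is usually computed from a baseline score and an improved score.

```python
# Illustrative sketch only: generic complex masking vs. complex mapping.
# Function names, the fusion rule, and all values are hypothetical.

def apply_complex_mask(noisy_bins, mask_bins):
    """Masking branch: multiply each noisy complex bin by a complex mask value."""
    return [m * y for m, y in zip(mask_bins, noisy_bins)]


def fuse_branches(masked_bins, mapped_bins, weight=0.5):
    """Assumed fusion: a convex combination of the masking and mapping estimates."""
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(masked_bins, mapped_bins)]


def relative_gain_percent(baseline, improved):
    """Relative improvement of a score over a baseline, in percent."""
    return 100.0 * (improved - baseline) / baseline


if __name__ == "__main__":
    noisy = [1 + 1j, 2 + 0j]       # made-up noisy spectrum bins
    mask = [0.5 + 0j, 0 + 1j]      # made-up complex mask values
    mapped = [1 + 0j, 0j]          # made-up direct (mapping) estimates

    masked = apply_complex_mask(noisy, mask)
    print(fuse_branches(masked, mapped))
    print(relative_gain_percent(2.0, 2.21))
```

Because the mask is complex-valued, the multiplication changes both the magnitude and the phase of each bin, which is the usual motivation for working on the complex spectrum rather than on the magnitude alone.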

speech enhancement; complex spectral mapping and masking; multiple feature aggregation; efficient Transformer; lightweight

Zhang Tianqi (张天骐), Shen Xiwen (沈夕文), Tang Juan (唐娟), Tan Shuang (谭霜)

School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065

Natural Science Foundation of Chongqing

cstc2021jcyjmsxmX0836

2024

Chinese Journal of Scientific Instrument (仪器仪表学报)
China Instrument and Control Society (中国仪器仪表学会)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 2.372
ISSN: 0254-3087
Year, volume (issue): 2024, 45(7)