Journal on Communications, 2024, Vol. 45, Issue 11: 15-26. DOI: 10.11959/j.issn.1000-436x.2024197

Multichannel speech enhancement algorithm based on hybrid reverberation model

Xie Yuan (解元) 1, Zou Tao (邹涛) 1, Sun Weijun (孙为军) 2, Xie Shengli (谢胜利) 2

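Author information

  • 1. School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou 510006, Guangdong, China
  • 2. Key Laboratory of Intelligent Information Processing and System Integration of IoT, Ministry of Education, Guangdong University of Technology, Guangzhou 510006, Guangdong, China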

Abstract

To address speech enhancement in reverberant and noisy environments, a speech enhancement model integrating a multichannel linear prediction model and a spatial coherence model was constructed, and a multichannel speech enhancement algorithm based on a hybrid reverberation model was designed. The algorithm divides the late reverberation into two components, which are modeled by the multichannel linear prediction model and the spatial coherence model, respectively. To optimize the model parameters, a Kalman filter is used to update them, and polynomial matrix eigenvalue decomposition is used for spatial, temporal, and frequency decorrelation to achieve dereverberation and noise reduction. Experimental results show that the proposed algorithm enhances speech in noisy environments with both high and low reverberation and outperforms popular speech enhancement algorithms: the perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) scores improved by up to 30% and 20%, respectively.
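The abstract describes a Kalman filter that updates linear prediction parameters. As a rough illustration of that one ingredient only, the sketch below runs a Kalman update on delayed linear prediction coefficients for a single channel and a single frequency bin, using synthetic data. It is not the authors' algorithm: the spatial coherence model and the polynomial matrix eigenvalue decomposition are omitted, and the function name `kalman_linear_prediction`, the parameters `order`, `delay`, `q`, and `lam`, and the test signal are all assumptions made for this example.

```python
import numpy as np

def kalman_linear_prediction(x, order=4, delay=2, q=1e-6, lam=1.0):
    """Kalman-filtered delayed linear prediction (one channel, one bin).

    State  : prediction coefficients g, modeled as a random walk (variance q).
    Observe: x[n] = u[n]^T g + d[n], where u[n] holds delayed past samples
             and d[n] is the desired signal, treated as observation noise
             with an ASSUMED known, fixed variance lam.
    Returns the prediction residual e and the final coefficient estimate g.
    """
    x = np.asarray(x, dtype=complex)
    g = np.zeros(order, dtype=complex)
    P = np.eye(order, dtype=complex)       # state error covariance
    e = np.zeros_like(x)
    for n in range(len(x)):
        # Delayed regression vector; samples before the start count as zero.
        u = np.array([x[n - delay - k] if n - delay - k >= 0 else 0.0
                      for k in range(order)], dtype=complex)
        P = P + q * np.eye(order)          # predict step (random-walk state)
        e[n] = x[n] - u @ g                # innovation = prediction residual
        Pu = P @ u.conj()
        s = (u @ Pu).real + lam            # innovation variance
        K = Pu / s                         # Kalman gain
        g = g + K * e[n]                   # coefficient update
        P = P - np.outer(K, u @ P)         # covariance update
    return e, g

# Synthetic example (hypothetical data): a white source plus a late tail
# generated by a stable autoregressive recursion on past observations.
rng = np.random.default_rng(0)
N, delay = 4000, 2
g_true = np.array([0.4, -0.25, 0.15, 0.05])
s_src = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
x = np.zeros(N, dtype=complex)
for n in range(N):
    tail = sum(g_true[k] * x[n - delay - k]
               for k in range(len(g_true)) if n - delay - k >= 0)
    x[n] = s_src[n] + tail

e, g_est = kalman_linear_prediction(x, order=4, delay=delay)
print("coefficient error:", np.linalg.norm(g_est - g_true))
print("input power:", np.mean(np.abs(x) ** 2))
print("residual power:", np.mean(np.abs(e) ** 2))
```

In this toy setting the residual `e` plays the role of the dereverberated signal. A practical system would run one such filter per frequency bin across all microphones and would estimate the time-varying desired-signal variance instead of fixing `lam`.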

Key words

multichannel speech enhancement / Kalman filter / polynomial matrix eigenvalue decomposition

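Publication year

2024
Journal on Communications
China Institute of Communications

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 1.265
ISSN: 1000-436X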