Principal component-based spectral iteration sparsity for speech enhancement

To address the poor performance of existing spectral sparsification methods for speech enhancement in complex environments, an iterative spectral sparsification method based on principal component analysis (PCA) is proposed. First, the spectrogram of the input signal is processed by two-dimensional median filtering, yielding a row-component spectrum and a column-component spectrum. PCA is then applied to the row-component spectrum sequence, which contains the main vocal part of the speech, to remove noise and preserve the main speech structure. Next, the result is combined with the column-component spectrum sequence, weighted by a scaling factor, to reconstruct the signal; a dynamic scaling factor is used to control the noise in the column-component spectrum. On this basis, the method exploits the noise-suppressing effect of sparsification and sparsifies the spectrum multiple times to further reduce noise. Experimental results show that, at an input signal-to-noise ratio (SNR) of 15 dB, the method improves the SNR by an average of 13.89 dB, 11.97 dB, 5.65 dB, 5.26 dB, and 4.73 dB for White, Pink, Babble, Volvo, and Factory noise, respectively. The method also suppresses noise effectively and retains important speech features at other input SNRs, and it reduces the speech distortion caused by background noise.
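The abstract does not give implementation details, so the following is only a rough NumPy sketch of the processing chain it describes, applied to a magnitude spectrogram `S` (frequency bins by frames). The kernel size, the number of principal components, the energy-based per-frame scaling factor, and the soft-threshold sparsification rule are all illustrative assumptions, not the authors' settings; the function names are hypothetical.

```python
import numpy as np

def median_filter_1d(S, size, axis):
    """Median-filter a 2-D array along one axis (odd `size`, edge padding)."""
    pad = size // 2
    widths = [(0, 0), (0, 0)]
    widths[axis] = (pad, pad)
    padded = np.pad(S, widths, mode="edge")
    # Windows are appended as a trailing axis of length `size`.
    windows = np.lib.stride_tricks.sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)

def pca_lowrank(S, n_components):
    """Rank-k PCA reconstruction; frames (columns) are treated as samples."""
    mean = S.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(S - mean, full_matrices=False)
    k = n_components
    return (U[:, :k] * s[:k]) @ Vt[:k] + mean

def enhance_magnitude(S, kernel=9, n_components=4, n_iter=3, tau=0.1):
    """Sketch: median split -> PCA on row part -> scaled mix -> iterated sparsification."""
    # Filtering along time keeps horizontal (row) structure,
    # filtering along frequency keeps vertical (column) structure.
    row = median_filter_1d(S, kernel, axis=1)
    col = median_filter_1d(S, kernel, axis=0)

    # Keep only the leading principal components of the row component.
    row_pca = np.maximum(pca_lowrank(row, n_components), 0.0)

    # ASSUMPTION: a per-frame "dynamic" scaling factor from the energy ratio.
    e_row = (row_pca ** 2).sum(axis=0)
    e_col = (col ** 2).sum(axis=0)
    alpha = e_row / (e_row + e_col + 1e-12)
    out = row_pca + alpha * col

    # ASSUMPTION: sparsification as repeated soft thresholding.
    for _ in range(n_iter):
        out = np.maximum(out - tau * out.mean(), 0.0)
    return out

# Usage on synthetic data (a real input would be the magnitude of an STFT).
rng = np.random.default_rng(0)
S_demo = rng.random((64, 100))
S_hat = enhance_magnitude(S_demo)
```

In a full system the enhanced magnitude would be recombined with the noisy phase and inverted back to a waveform; that step is omitted here because the abstract does not specify it.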

Keywords: Speech enhancement; Multidimensional spectrum analysis; Spectral sparsity; Principal component analysis

Dong Xian (董娴), Shao Yubin (邵玉斌), Du Qingzhi (杜庆治), Long Hua (龙华), Ma Dinan (马迪南)

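Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China

Yunnan Key Laboratory of Media Convergence, Kunming 650500, China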

Funding: Project of the Yunnan Key Laboratory of Media Convergence

320225403

2024

Journal of Sichuan University (Natural Science Edition)
Sichuan University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.358
ISSN: 0490-6756
Year, volume (issue): 2024, 61(3)