Research on Time-Frequency Domain Speech Enhancement Techniques for Air Traffic Control

This study uses speech enhancement to address voice interference in air traffic control (ATC) communications. By combining frequency-domain noise reduction with time-domain enhancement, it proposes an improved U-Net model that effectively denoises control speech. Noise reduction is evaluated directly using the signal-to-noise ratio (SNR) and the mean opinion score (MOS). Experimental results show that the SNR of the improved model is 4.5663 higher than that of the baseline U-Net model, reaching 7.3861. Because SNR is difficult to compute accurately in a real ATC working environment, an indirect evaluation is also used: the output of a speech recognition system serves as a proxy for the model's noise reduction performance on real ATC audio. After speech enhancement, the test audio shows a 1.79% reduction in average word error rate and a 3% reduction in sentence error rate in the speech recognition system. The improved model therefore improves voice quality and raises the accuracy of the speech recognition system.
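The metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact SNR formula or the scoring tool used, so everything below is an assumption: `snr_db` uses the common definition that treats the difference between the enhanced and clean signals as residual noise, and `error_rates` uses a plain Levenshtein alignment over tokens. The function names and sample data are hypothetical.

```python
import math


def snr_db(clean, enhanced):
    """SNR in dB, treating (enhanced - clean) as the residual noise.

    Assumed definition; requires a clean reference signal.
    """
    signal_power = sum(c * c for c in clean)
    noise_power = sum((e - c) ** 2 for c, e in zip(clean, enhanced))
    if noise_power == 0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(refs, hyps):
    """Return (token error rate, sentence error rate) for paired transcripts."""
    total_tokens = sum(len(r) for r in refs)
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    wrong_sentences = sum(1 for r, h in zip(refs, hyps) if r != h)
    return total_edits / total_tokens, wrong_sentences / len(refs)


# Hypothetical toy data, not taken from the paper.
clean = [1.0, -1.0, 1.0, -1.0]
enhanced = [1.1, -0.9, 1.1, -0.9]
refs = [["climb", "to", "flight", "level", "three", "two", "zero"],
        ["contact", "tower"]]
hyps = [["climb", "to", "flight", "level", "three", "zero", "zero"],
        ["contact", "tower"]]

snr = snr_db(clean, enhanced)
wer, ser = error_rates(refs, hyps)
```

Note that `snr_db` needs the clean signal as an argument. Live ATC recordings presumably come without such a reference, which is consistent with the abstract's move to recognition error rates as an indirect measure.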

speech enhancement; deep learning; U-Net; ATC; ASR

Li Yukun (李煜琨), Kong Jianguo (孔建国), Jiang Peiyuan (蒋培元), Liang Haijun (梁海军)

Civil Aviation Flight University of China, Guanghan, Sichuan 618000, China

National Key R&D Program of China (2021YFF0603904); Fundamental Research Funds for the Central Universities (PHD2023-035, ZHMH2022-009); Sichuan Science and Technology Program (2022YFG0210)

2024

Aeronautical Computing Technique (航空计算技术)
AVIC Xi'an Aeronautics Computing Technique Research Institute

CSTPCD
Impact factor: 0.316
ISSN: 1671-654X
Year, volume (issue): 2024, 54(3)