Speech enhancement method based on multi-domain fusion and neural architecture search

To further improve the self-learning and noise-reduction capabilities of speech enhancement models, a speech enhancement method based on multi-domain fusion and neural architecture search is proposed. The method designs a multi-spatial-domain mapping and fusion mechanism for speech signals in order to mine the correlations between their real-valued and complex-valued representations. Building on the characteristics of the model's convolution and pooling operations, it proposes a complex neural architecture search mechanism that constructs the speech enhancement model efficiently and automatically through a purpose-designed search space, search strategy, and evaluation strategy. In comparative generalization experiments against baseline models, the best model found by the search improves on the strongest baseline by up to 5.6% on both perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI), while having the fewest parameters.
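The abstract does not specify the actual domain mappings or network operators, so the following is only a minimal sketch of the general idea, under stated assumptions: a waveform is mapped to a real-valued view (spectral magnitude) and a complex-valued view (real and imaginary parts of a short-time Fourier transform), and a complex convolution is assembled from real-valued products. All function names, the frame length, and the hop size are hypothetical and are not taken from the paper.

```python
import cmath
import math


def dft_frame(frame):
    """Naive one-sided DFT of one real-valued frame (O(n^2), fine for a sketch)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft(signal, frame_len=16, hop=8):
    """Hann-windowed short-time Fourier transform: a list of frames of complex bins."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len) for t in range(frame_len)]
    n_frames = 1 + (len(signal) - frame_len) // hop
    return [
        dft_frame([signal[i * hop + t] * window[t] for t in range(frame_len)])
        for i in range(n_frames)
    ]


def multi_domain_features(signal):
    """Assumed mapping: magnitude (real domain) plus real/imaginary parts (complex domain)."""
    spec = stft(signal)
    return {
        "magnitude": [[abs(c) for c in frame] for frame in spec],
        "real": [[c.real for c in frame] for frame in spec],
        "imag": [[c.imag for c in frame] for frame in spec],
    }


def complex_conv1d(x, kernel):
    """Valid-mode complex 1-D convolution built only from real-valued products.

    Uses (a + ib)(c + id) = (ac - bd) + i(ad + bc) for every tap.
    """
    m = len(kernel)
    out = []
    for n in range(len(x) - m + 1):
        re = im = 0.0
        for j in range(m):
            sample = x[n + m - 1 - j]  # convolution flips the kernel
            a, b = sample.real, sample.imag
            c, d = kernel[j].real, kernel[j].imag
            re += a * c - b * d
            im += a * d + b * c
        out.append(complex(re, im))
    return out
```

A fusion step could then stack the three views as input channels of a network. Whether the paper fuses them this way is not stated in the abstract.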

Keywords: speech enhancement model; complex spatial domain mapping; multi-domain fusion; complex neural architecture search; low-cost evaluation

Zhang Rui (张睿), Zhang Pengyun (张鹏云), Sun Chaoli (孙超利)

School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China

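Funding: National Natural Science Foundation of China (62372319); Humanities and Social Sciences Research Fund of the Ministry of Education (23YJCZH299); Key Research and Development Program of Shanxi Province (202102020101002); Basic Research Program of Shanxi Province (20210302123216); Graduate Joint Training Demonstration Base Fund of Taiyuan University of Science and Technology (JD2022004); Graduate Education Innovation Fund of Taiyuan University of Science and Technology (SY2023040)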

2024

Journal on Communications
China Institute of Communications

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.265
ISSN: 1000-436X
Year, volume (issue): 2024, 45(2)