Research on the Application of Generative Adversarial Networks in the Active Sound Generation System of Electric Vehicles

Liang Kai¹, Zhang Wei¹, Zhao Haijun²

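Author information

1. Information Technology Center, Luoyang Institute of Science and Technology, Luoyang 471023, Henan, China
2. National and Local Joint Engineering Research Center for Intelligent Vehicle-Road Collaboration and Safety Technology, Tianjin University of Technology and Education, Tianjin 300222, China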

Abstract

To improve the personalization and quality of simulated engine sound for electric vehicles, a generative adversarial network (GAN) is introduced and a GAN-based active sound generation model for electric vehicles is constructed. The structure and convolution kernel size of each network layer are designed, and the adaptive moment estimation (Adam) algorithm is used to optimize the layer weights. The model is then used in sample generation experiments. A phase perturbation operation is proposed during training to suppress the tonal noise introduced by upsampling. To compare the performance of different input signals in the GAN model, a GAN model with two-dimensional spectrogram input is also constructed and used in comparison experiments. The results show that the model accurately learns the feature distribution of the original audio signals. In listening tests with human subjects, the realism of the generated sound samples exceeds 90%. In a 1-NN classification evaluation based on the leave-one-out (LOO) method, the LOO accuracy of both the raw-audio and the two-dimensional spectrogram GAN models is greater than or close to 50%, indicating that training did not overfit and that the sound effects generated by the proposed method are realistic and reliable.
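The LOO 1-NN evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each sound sample has already been reduced to a fixed-length feature vector and that Euclidean distance is used; the abstract does not state the feature space or distance, so both are assumptions, and the function name `loo_1nn_accuracy` and the toy Gaussian data are hypothetical. Real samples are labelled 1 and generated samples 0; each sample is held out in turn and classified by its nearest remaining neighbour. An accuracy near 50% suggests the two sets are hard to tell apart, an accuracy near 100% suggests they are easily separated, and an accuracy well below 50% suggests the generator is copying training samples.

```python
import math
import random


def loo_1nn_accuracy(real, fake):
    """Leave-one-out accuracy of a 1-NN real-vs-generated classifier.

    real, fake: lists of equal-length feature vectors (lists of floats).
    Euclidean distance is an assumption made for this sketch.
    """
    samples = [(x, 1) for x in real] + [(x, 0) for x in fake]
    correct = 0
    for i, (xi, yi) in enumerate(samples):
        best_dist, best_label = math.inf, None
        for j, (xj, yj) in enumerate(samples):
            if i == j:
                continue  # hold out the query sample itself
            d = math.dist(xi, xj)
            if d < best_dist:
                best_dist, best_label = d, yj
        correct += int(best_label == yi)
    return correct / len(samples)


def gaussian_cloud(rng, n, dim, mean):
    """Toy stand-in for audio features: n points drawn from N(mean, 1)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
real = gaussian_cloud(rng, 100, 8, 0.0)
same = gaussian_cloud(rng, 100, 8, 0.0)      # same distribution as real
shifted = gaussian_cloud(rng, 100, 8, 10.0)  # clearly different distribution

acc_same = loo_1nn_accuracy(real, same)        # expected to sit near 0.5
acc_shifted = loo_1nn_accuracy(real, shifted)  # clouds are trivially separable
acc_copied = loo_1nn_accuracy(real, [list(x) for x in real])  # memorised copies
print(acc_same, acc_shifted, acc_copied)
```

In the copied case every sample's nearest neighbour is its own duplicate carrying the opposite label, which is why memorisation drives the accuracy toward zero rather than toward 50%.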

Key words

electric vehicles/active sound generation/generative adversarial network/raw audio/spectrogram

Funding

National Natural Science Foundation of China (U1604141)

China University Industry-University-Research Innovation Fund (2021ITA07021)

Publication year

2024

Journal of Shenyang Ligong University

Shenyang Ligong University

Impact factor: 0.223

ISSN: 1003-1251

Number of references: 18
