Speech Style Conversion Technology Based on StarGAN-VC
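申少鹏 (Shen Shaopeng) 1, 胡松涛 (Hu Songtao) 1
Author information
- 1. 河南建筑职业技术学院 (Henan Technical College of Construction), Zhengzhou, Henan 450000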
Abstract
This paper studies a speech style conversion technique based on the Star Generative Adversarial Networks-Voice Conversion (StarGAN-VC) model, with the aim of converting speech signals efficiently. First, the basic principle of the StarGAN-VC-based voice conversion method is described in detail. Second, the feature extraction and fundamental frequency conversion methods are examined, together with the mathematical formulation of the StarGAN-VC model. Finally, the method's performance is evaluated experimentally on the VCC2018 dataset. The results show that the method performs satisfactorily on metrics such as spectral envelope similarity and fundamental frequency accuracy.
Key words
deep learning/speech style conversion/Star Generative Adversarial Networks-Voice Conversion (StarGAN-VC) model/spectral analysis
Publication year
2024