Speech Style Conversion Technology Based on StarGAN-VC

Shen Shaopeng (申少鹏)¹, Hu Songtao (胡松涛)¹

Author information

1. Henan Technical College of Construction, Zhengzhou, Henan 450000, China

Abstract

This paper studies a speech style conversion technique based on the Star Generative Adversarial Networks-Voice Conversion (StarGAN-VC) model, aiming at efficient conversion of speech signals. First, the basic principle of the StarGAN-VC-based voice conversion method is described. Second, the feature extraction and fundamental frequency conversion methods, as well as the mathematical principles of the StarGAN-VC model, are examined. Finally, the method's performance is verified through experiments on the VCC2018 dataset. The experimental results show that the method achieves satisfactory results on metrics such as spectral envelope similarity and fundamental frequency accuracy.

Key words

deep learning / language style conversion / Star Generative Adversarial Networks-Voice Conversion (StarGAN-VC) model / spectral analysis

Publication year

2024

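Audio Engineering (电声技术)

Institute of Television and Electroacoustics (The Third Research Institute of China Electronics Technology Group Corporation)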

Impact factor: 0.259

ISSN: 1002-8684

Number of references: 10
