大数据 (Big Data Research), 2024, Vol. 10, Issue 5: 56-73. DOI: 10.11959/j.issn.2096-0271.2024014

A survey of emotional speech synthesis

施昊翔 张旭龙 王健宗 程宁 肖京

施昊翔 1, 张旭龙 2, 王健宗 2, 程宁 2, 肖京 2

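Author information

  • 1. Ping An Technology (Shenzhen) Co., Ltd., Shenzhen 518063, Guangdong, China; University of Science and Technology of China, Hefei 230026, Anhui, China
  • 2. Ping An Technology (Shenzhen) Co., Ltd., Shenzhen 518063, Guangdong, China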

Abstract

Speech synthesis, an important research direction in speech technology, converts text into speech. With the rapid development of deep learning, its goal has moved well beyond producing audio that is merely "understandable": adding emotion often makes synthesized speech more expressive. Emotional speech synthesis therefore injects different emotions into speech and regulates them, so as to generate flexible and accurate emotional speech. Starting from several key scientific problems in emotional speech synthesis, this paper summarizes and analyzes recent progress in emotion transfer, emotion intensity control, and emotion mixing, introduces the relevant datasets and evaluation metrics, and concludes with an outlook on future work in emotional speech synthesis.

Key words

emotional speech synthesis / emotion transfer / emotion intensity / deep learning

Funding

Key-Area Research and Development Program of Guangdong Province, "New Generation Artificial Intelligence" Major Project (2021B0101400003)

Publication year

2024

Journal: 大数据 (Big Data Research)
Publisher: 人民邮电出版社 (Posts & Telecom Press)
Indexed in: CSTPCD
ISSN: 2096-0271