Sub-syllable Representation-based Hmong Language Text-to-Speech Method

Speech synthesis for minority languages contributes to the inheritance, protection, and development of national cultures, yet research results in this field remain limited. To address the synthesis errors that tend to arise when the same word carrying different tones is pronounced similarly, a sub-syllable representation-based text-to-speech method for the Hmong language is proposed. The method uses sub-syllables as training units to represent Hmong pronunciation information, so that similar sounds across different syllables are learned distinctly. Because the alignment between the text sequence and the Mel-spectrogram is monotonic, a monotonic alignment loss is introduced to guide the attention module toward more accurate alignment, which reduces the word skipping and repetition caused by the autoregressive nature of the attention mechanism. To verify the effectiveness of the proposed method, the self-built Hmong speech synthesis corpus HmongSpeech (download link: http://sxjxsf.gzmu.edu.cn/info/1728/1214.htm) is used as the benchmark dataset, and comparative experiments are conducted against typical speech synthesis methods. The experimental results show that the proposed method reduces the synthesis error rate caused by the similar pronunciation of the same word with different tones: the word error rate is only 0.96%, which is 6.25% better than the baseline method.
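The abstract does not state the form of the monotonic alignment loss. As a rough illustration of the general idea only, the sketch below implements a guided-attention-style penalty, a well-known technique from attention-based TTS that is swapped in here and is not claimed to be the authors' formulation. The function name `guided_attention_penalty`, the width parameter `g`, and the toy matrices are all assumptions made for this example.

```python
import math


def guided_attention_penalty(attention, g=0.2):
    """Mean penalty for attention mass that lies far from the diagonal.

    attention[t][n] is the weight that output frame t places on input
    token n. This is an illustrative guided-attention-style penalty,
    not the loss defined in the paper.
    """
    num_frames = len(attention)
    num_tokens = len(attention[0])
    total = 0.0
    for t, row in enumerate(attention):
        for n, weight in enumerate(row):
            # Offset from the diagonal in normalized (0..1) coordinates.
            d = n / num_tokens - t / num_frames
            # Penalty mask: 0 on the diagonal, approaching 1 far from it.
            mask = 1.0 - math.exp(-(d * d) / (2.0 * g * g))
            total += weight * mask
    return total / (num_frames * num_tokens)


# Toy alignments (hypothetical): one monotonic, one running backwards.
diagonal = [[1.0 if n == t else 0.0 for n in range(4)] for t in range(4)]
reversed_alignment = [[1.0 if n == 3 - t else 0.0 for n in range(4)] for t in range(4)]

print(guided_attention_penalty(diagonal))
print(guided_attention_penalty(reversed_alignment))
```

A perfectly diagonal alignment incurs no penalty, while an alignment that runs against the text order is penalized. Adding a term of this kind to the training objective pushes the attention toward monotonic alignments, which is the behavior the abstract associates with fewer skipped and repeated words.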

Hmong language text-to-speech; sub-syllable; monotonic alignment; corpus; Mel-spectrogram

Cai Shan, Wang Lin, Tan Mian, Guo Sheng, Wu Lei, Wang Fei

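School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025

Key Laboratory of Pattern Recognition and Intelligent Systems of Guizhou Province, Guiyang 550025

College of Humanities, Science and Technology, Guizhou Minzu University, Guiyang 550025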

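Funding: National Natural Science Foundation of China (62162012); Guizhou Provincial Science and Technology Program (黔科合基础-ZK[2022]一般195, 黔科合基础-ZK[2023]一般143, 黔科合平台人才-ZCKJ[2021]007); Natural Science Research Project of the Guizhou Provincial Department of Education (黔教技[2023]061号, 黔教技[2023]012号, 黔教技[2022]015号); Guizhou Provincial Youth Science and Technology Talent Growth Project (黔教合KY字[2021]115, 黔教合KY字[2021]110); Open Project of the Key Laboratory of Pattern Recognition and Intelligent Systems of Guizhou Province (GZMUKL[2022]KF01, GZMUKL[2022]KF05); Guizhou Provincial High-level Innovative Talent Project (黔科合平台人才-GCC[2023]027); Ministry of Education Industry-University Cooperative Education Project (221001766110209)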

2024

Science Technology and Engineering
Chinese Society of Technology Economics

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.338
ISSN: 1671-1815
Year, volume (issue): 2024, 24(19)