
An unsupervised phoneme segmentation method for Lao language with multi-feature interaction fusion

Existing methods take insufficient account of tonal variation in Lao and of audio diversity, which leads to inaccurate phoneme segmentation. To address this problem, an unsupervised phoneme segmentation method for Lao based on multi-feature interaction fusion is proposed. First, self-supervised features, spectral features, and pitch features are encoded independently, which avoids the limitations of relying on a single feature. Second, the independently encoded features are progressively fused with an attention mechanism, so that the model captures Lao tonal variation and phoneme boundary information more comprehensively. Finally, a learnable framework is used to optimize the phoneme segmentation model. Experimental results show that, compared with the baseline methods, the proposed method improves the R-value by 27.88% on the Lao phoneme segmentation task.
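The abstract reports its result as an R-value but does not define the metric. The sketch below computes the R-value as it is commonly defined for boundary detection (from the hit rate and the over-segmentation rate). It is an illustration only, not the authors' evaluation code: the function names, the greedy one-to-one matching, and the 20 ms tolerance are assumptions that do not come from this record.

```python
import math


def count_hits(ref, hyp, tol=0.02):
    """Count reference boundaries matched by a distinct hypothesis boundary.

    ref, hyp: boundary times in seconds.
    tol: matching tolerance in seconds (20 ms is an assumed default).
    """
    used = set()
    hits = 0
    for r in ref:
        best = None
        for j, h in enumerate(hyp):
            if j in used or abs(h - r) > tol:
                continue
            # Keep the closest unused hypothesis boundary within tolerance.
            if best is None or abs(h - r) < abs(hyp[best] - r):
                best = j
        if best is not None:
            used.add(best)
            hits += 1
    return hits


def r_value(ref, hyp, tol=0.02):
    """R-value from hit rate (recall) and over-segmentation (OS)."""
    if not ref:
        raise ValueError("reference boundary list must not be empty")
    recall = count_hits(ref, hyp, tol) / len(ref)
    # OS > 0 means more boundaries were proposed than exist in the reference.
    over_seg = len(hyp) / len(ref) - 1.0
    r1 = math.sqrt((1.0 - recall) ** 2 + over_seg ** 2)
    r2 = (-over_seg + recall - 1.0) / math.sqrt(2.0)
    return 1.0 - (abs(r1) + abs(r2)) / 2.0
```

A perfect segmentation gives an R-value of 1, while both missed boundaries and spurious extra boundaries lower it, which is why the metric is often preferred over the F1 score for segmentation tasks.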

unsupervised learning; feature fusion; Lao language; phoneme segmentation; speech representation

Li Xinjie (李新洁), Wang Wenjun (王文君), Dong Ling (董凌), Lai Hua (赖华), Yu Zhengtao (余正涛), Gao Shengxiang (高盛祥)


Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China

Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, Yunnan, China


National Natural Science Foundation of China (62376111, U23A20388, U21B2027, 62366027); Yunnan Key Research and Development Program (202303AP140008, 202302AD080003, 202401BC070021, 202103AA080015); Yunnan Science and Technology Talent and Platform Program (202105AC160018)

2024

Computer Engineering & Science
College of Computer Science, National University of Defense Technology

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Year, volume (issue): 2024, 46(5)