In the Mongolian speech recognition task, the Transformer model cannot learn the correspondence between Mongolian words containing control symbols and their speech, and therefore fails to adapt to the Mongolian language. To address this, a Mongolian word encoding method for the Transformer model is proposed. The method mixes Mongolian letter features with word features: by incorporating letter information, the Transformer model can distinguish Mongolian words containing control symbols and learn the correspondence between Mongolian words and their pronunciations. On the IMUT-MC dataset, a Transformer model is constructed and ablation and comparison experiments on the word feature encoding method are carried out. The ablation results show that the word feature encoding method reduces HWER, WER, and SER by 23.4%, 6.9%, and 2.6%, respectively; the comparison results show that the word feature encoding method outperforms all compared methods, with HWER and WER reaching 11.8% and 19.8%.
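A minimal sketch of the mixed letter/word encoding idea, assuming a PyTorch implementation (the module, dimension, and parameter names here are hypothetical, not the paper's exact architecture): each token's representation is the sum of its word embedding and a pooled embedding of its letters, so two words that differ only in control symbols remain distinguishable to the encoder.

```python
import torch
import torch.nn as nn

class MixedWordLetterEmbedding(nn.Module):
    """Hypothetical mixed encoding: word embedding + pooled letter embeddings."""

    def __init__(self, word_vocab: int, letter_vocab: int, d_model: int, pad_id: int = 0):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, d_model)
        self.letter_emb = nn.Embedding(letter_vocab, d_model, padding_idx=pad_id)
        self.pad_id = pad_id

    def forward(self, word_ids: torch.Tensor, letter_ids: torch.Tensor) -> torch.Tensor:
        # word_ids:   (batch, seq)           one id per Mongolian word
        # letter_ids: (batch, seq, max_len)  letter ids per word, right-padded
        w = self.word_emb(word_ids)                              # (B, S, D)
        l = self.letter_emb(letter_ids)                          # (B, S, L, D)
        mask = (letter_ids != self.pad_id).unsqueeze(-1).float() # (B, S, L, 1)
        # Mean-pool letter embeddings, ignoring padding positions.
        l = (l * mask).sum(dim=2) / mask.sum(dim=2).clamp(min=1.0)
        return w + l                                             # mixed encoding
```

The resulting embeddings would then be fed to a standard Transformer encoder in place of plain word embeddings; summation keeps the model dimension unchanged, though concatenation followed by a projection would be an equally plausible mixing choice under these assumptions.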