Research on Small-Sample Language Speech Recognition Based on Transfer Learning

Zhao Zebin (赵泽彬)¹, Lan Liang (兰亮)², Jiang Dan (姜丹)¹, Wang Daliang (王大亮)³

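Author affiliations

1. School of Information Engineering, Beijing Institute of Graphic Communication, Beijing 102600
2. Science and Technology Innovation Department, China Telecom Corporation Limited Sichuan Branch, Chengdu 610041
3. AI Innovation Center, Datatang (Beijing) Technology Co., Ltd., Beijing 100192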

Abstract

This paper proposes a transfer learning approach to speech recognition for small-sample languages and examines its implementation and effectiveness. To address problems common in small-sample language speech recognition, namely insufficient data, low data quality, and the lack of pronunciation dictionaries, the work builds on transfer learning and introduces an iterative language model construction method aimed at improving recognition performance. The method applies dialect-specific processing to the Mandarin pronunciation dictionary and text corpus within a standardized, iterable training workflow, producing a text corpus specific to Southwest Mandarin from a linguistic perspective; the resulting language model improves prediction accuracy. Comparative experiments show that the transfer learning model achieves good character error rates (CER) on both Mandarin and Southwest Mandarin datasets. The final CER for Southwest Mandarin speech recognition is below 14.4%, and the CER on the public AISHELL-1 Mandarin dataset is 5.50%, the best recognition result among models of the same period. These results demonstrate knowledge transfer from Mandarin to Southwest Mandarin.

Key words

speech recognition (ASR)/neural network/transfer learning/small sample/dialect

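Funding

Beijing Natural Science Foundation

Key Science and Technology Project of the Beijing Municipal Education Commission (KZ202010015021)

Joint Training Base Construction Project for Professional Degree Graduate Students (21090223001)

Doctoral Start-up Fund of Beijing Institute of Graphic Communication (27170123036)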

Publication year

2024

Journal of Beijing Institute of Graphic Communication

Beijing Institute of Graphic Communication

Impact factor: 0.247

ISSN: 1004-8626
