
Research on Low-resource Qingdao Dialect Speech Recognition Method

Dialect recognition is an important research direction in automatic speech recognition. Common speech recognition systems are trained on the standard language, which leads to poor performance on dialects. This paper therefore takes the Qingdao dialect as a case study for dialect speech recognition. To address the scarcity of dialect corpora and the difficulty of training deep network models, both of which limit recognition accuracy, we apply data augmentation and build a dialect speech recognition model based on an improved Conformer. First, multi-source speech data are collected to construct a small dialect corpus. Second, data augmentation techniques are used to expand the training data and mitigate the data scarcity. Finally, to extract information more effectively, the down-sampling structure of the Conformer model is improved by introducing dilated convolution and the Mish activation function, so that the model maps speech directly to text. Experimental results show that the end-to-end model with the improved down-sampling module, combined with data augmentation, reaches a character error rate of 25.96%, indicating that it can effectively recognize dialect speech under low-resource conditions.
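The character error rate quoted in the abstract is conventionally defined as the character-level edit distance between the recognized text and the reference transcript, divided by the number of reference characters. The sketch below illustrates that standard definition only; the function names and the two sample sentence pairs are hypothetical and are not taken from the paper's corpus or evaluation code.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total edit distance over total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference transcripts are empty")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_chars


# Hypothetical transcripts, for illustration only.
refs = ["今天天气很好", "你吃饭了吗"]
hyps = ["今天天气真好", "你吃了吗"]
cer = character_error_rate(refs, hyps)
print(f"CER = {cer:.2%}")
```

Here the first pair differs by one substitution and the second by one deletion, so the sketch counts 2 errors over 11 reference characters. Summing errors over the whole test set before dividing, rather than averaging per-utterance rates, is a common choice; the abstract does not state which convention the paper uses.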

Keywords: speech recognition; end-to-end; low resource; data augmentation; Qingdao dialect

相紫涵, 谷潇, 饶崇郅, 渐令

School of Economics and Management, China University of Petroleum (East China), Qingdao, Shandong 266580

Funding: National Key R&D Program of China (2021YFA1000100, 2021YFA1000102)

2024

Computer Technology and Development (计算机技术与发展)
Shaanxi Computer Society

CSTPCD
Impact factor: 0.621
ISSN: 1673-629X
Year, volume (issue): 2024, 34(4)