低资源青岛方言语音识别方法研究

Research on Low-resource Qingdao Dialect Speech Recognition Method

Abstract: Dialect recognition is an important research direction in automatic speech recognition. Common speech recognition systems are trained on standard languages, so they perform poorly on dialects. This paper therefore takes the Qingdao dialect as an application case for dialect speech recognition. To address the scarcity of dialect corpora and the difficulty of training deep network models, both of which limit recognition accuracy, we apply data augmentation and build a dialect speech recognition model based on an improved Conformer. First, multi-source speech data are collected to construct a small dialect corpus. Second, data augmentation techniques are used to expand the training data and mitigate corpus scarcity. Finally, to extract information more effectively, the down-sampling structure of the Conformer model is improved by introducing dilated convolution and the Mish activation function, realizing a direct mapping from speech to text. Experimental results show that the end-to-end model with the improved down-sampling module, combined with data augmentation, reaches a character error rate of 25.96%, which effectively enables dialect recognition under low-resource conditions.
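The abstract names two quantities that are easy to pin down in code: the Mish activation and the character error rate (CER) used to report the 25.96% result. The following is a minimal sketch, not the authors' implementation. It assumes the standard definitions, Mish(x) = x · tanh(ln(1 + e^x)) and CER = (character-level edit distance) / (reference length); the function names `mish`, `edit_distance`, and `character_error_rate` are illustrative, and the abstract does not state how the paper tokenizes or normalizes text before scoring.

```python
import math


def mish(x: float) -> float:
    """Mish activation under its standard definition: x * tanh(softplus(x))."""
    # Numerically stable softplus, equal to log(1 + exp(x)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters (assumed definition)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `character_error_rate("abcd", "abxd")` gives 0.25: one substitution over four reference characters. Because insertions count as errors, CER can exceed 1.0 when the hypothesis is much longer than the reference.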

Keywords:

speech recognition; end-to-end; low resource; data augmentation; Qingdao dialect

Authors:

相紫涵, 谷潇, 饶崇郅, 渐令

Affiliation:

School of Economics and Management, China University of Petroleum (East China), Qingdao, Shandong 266580, China

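Keywords (original Chinese):

语音识别 (speech recognition); 端到端 (end-to-end); 低资源 (low resource); 数据增强 (data augmentation); 青岛方言 (Qingdao dialect)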

Funding:

National Key Research and Development Program of China

Project numbers:

2021YFA1000100, 2021YFA1000102

Publication year:

2024

DOI:

10.20165/j.cnki.ISSN1673-629X.2024.0022

Computer Technology and Development (计算机技术与发展)

Shaanxi Computer Society (陕西省计算机学会)

CSTPCD

Impact factor: 0.621

ISSN: 1673-629X

Year, volume (issue): 2024, 34(4)

Number of references: 19