Computer Engineering and Design (计算机工程与设计), 2024, Vol. 45, Issue 7: 2173-2179. DOI: 10.16208/j.issn1000-7024.2024.07.034

Regularization optimization algorithm for coordinating speech energy region (协调语音能量区域的正则化优化算法)

Shi Chenkang (师晨康) 1, Xue Peiyun (薛珮芸) 2, Bai Jing (白静) 1, Zhao Jianxing (赵建星) 1

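Author information

  • 1. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600
  • 2. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600; Postdoctoral Research Workstation, Shanxi Academy of Advanced Research and Innovation, Taiyuan, Shanxi 030032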

Abstract

To mitigate overfitting in speech recognition models, a regularization optimization algorithm that coordinates speech energy regions is proposed. Based on the formant characteristics of speech, the high-energy regions of the speech signal are dropped out collectively, which increases the model's attention to the low-energy regions. To further improve acoustic model performance, a gated convolutional neural network (GCNN) with eight stacked layers is used to extract temporal features from speech, and its gating mechanism is optimized to alleviate gradient decay. The connectionist temporal classification (CTC) algorithm is used to train and decode the model, with Chinese characters as the modeling unit. Experimental results on Aishell-1, a public Chinese speech dataset, show that the character error rate of the model is reduced to 11.27%, a decrease of 7.93% compared with the baseline model, which verifies the effectiveness of the method.
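
The abstract describes dropping out the high-energy regions of the speech signal as a group, but it does not give the procedure itself. The following is only a minimal sketch of one way such a step could look, not the authors' implementation. It assumes the input is a spectrogram stored as a list of frames, and the function name `drop_high_energy` and the hyperparameters `energy_quantile` and `drop_prob` are hypothetical.

```python
import random


def drop_high_energy(spec, energy_quantile=0.8, drop_prob=0.5, rng=None):
    """Zero out the highest-energy cells of a spectrogram as one group.

    spec: list of frames, each a list of energies per frequency bin.
    energy_quantile, drop_prob: hypothetical hyperparameters (assumptions,
    not values taken from the paper).
    Returns a new spectrogram; the input is left unmodified.
    """
    rng = rng or random.Random()
    values = sorted(v for frame in spec for v in frame)
    # With probability 1 - drop_prob (or for empty input), change nothing.
    if not values or rng.random() >= drop_prob:
        return [list(frame) for frame in spec]
    # Threshold = the energy found at the requested quantile over all cells.
    index = min(len(values) - 1, int(energy_quantile * len(values)))
    threshold = values[index]
    # Drop every cell at or above the threshold together, so the model
    # has to rely on the remaining low-energy cells for this utterance.
    return [[0.0 if v >= threshold else v for v in frame] for frame in spec]


# Toy spectrogram: 3 frames x 2 frequency bins.
toy = [[1.0, 9.0], [2.0, 8.0], [3.0, 1.0]]
masked = drop_high_energy(toy, energy_quantile=0.5, drop_prob=1.0)
```

Such a step would be applied only during training. Unlike standard dropout, this sketch does not rescale the surviving values, and it selects cells by a global quantile rather than by formant tracking, which the paper may handle differently.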

Key words

speech recognition/acoustic model/speech energy region/regularization/convolutional neural network/connectionist temporal classification/deep learning

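Funding

Shanxi Province Applied Basic Research Program (201901D111094)

Shanxi Province Basic Research Program (Youth) (20210302124544)

Shanxi Province Merit-based Science and Technology Activities Fund for Returned Overseas Scholars (20200017)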

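Publication year

2024
Computer Engineering and Design (计算机工程与设计)
Institute 706, Second Academy of China Aerospace Science and Industry Corporation (中国航天科工集团二院706所)

CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.617
ISSN: 1000-7024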