Research on Low-Resource Speech Recognition Methods Based on ADTDNN

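Abstract (translated from the Chinese): To address the reduced recognition accuracy and poor generalization caused by insufficient training data under low-resource conditions, a speech recognition method is proposed. The method extracts feature information with convolution and pooling, and fuses an attention mechanism with DTDNN to form ADTDNN, improving the model's ability to capture key information in a sequence in low-resource settings. Connectionist temporal classification (CTC) is adopted to simplify the recognition pipeline, and a Transformer serves as the language model. Experiments on the Aishell-1 dataset show that, in a low-resource setting, the ADTDNN-based model lowers the character error rate by 3.7% and 1.0% compared with the mainstream end-to-end models LAS and Transformer, respectively.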
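The reported metric is a character-level error rate, the usual choice for Mandarin corpora such as Aishell-1. As a minimal sketch of how such a rate is conventionally computed (Levenshtein distance between reference and hypothesis, divided by the reference length), the following may help. The function names `edit_distance` and `cer` are illustrative assumptions, not taken from the paper, and the paper's own scoring script may differ in details such as text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]  # deleting i reference symbols to reach an empty hypothesis
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion of r
                curr[j - 1] + 1,         # insertion of h
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `cer("语音识别方法", "语音识方法")` counts one deleted character out of six reference characters.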

English title: Research on Low-Resource Speech Recognition Based on ADTDNN

English abstract: A speech recognition approach is proposed to address the reduced recognition accuracy and poorer generalization caused by insufficient training data in low-resource conditions. The method uses convolution and pooling to extract feature information. It combines the attention mechanism with DTDNN, a time-delay neural network variant, to form ADTDNN, enhancing the model's ability to capture key information in sequences in low-resource environments. The approach employs connectionist temporal classification (CTC) to streamline the recognition process, and a Transformer is used as the language model. Experimental results on the Aishell-1 dataset show that, in low-resource settings, the ADTDNN-based speech recognition model reduces the character error rate by 3.7% and 1.0% compared with the mainstream end-to-end models LAS and Transformer, respectively.
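Connectionist temporal classification lets the acoustic model emit one label per frame without a frame-level alignment. At decoding time, the simplest strategy is to merge consecutive repeated labels and then drop the blank symbol. The sketch below illustrates only that generic collapse rule. The blank index `0` and the function name `ctc_greedy_collapse` are assumptions made for illustration, and the record does not say which decoding strategy the paper actually uses.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_collapse(frame_ids, blank=BLANK):
    """Map per-frame label ids to an output sequence.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove blank labels.
    A blank between two identical labels keeps them as separate outputs.
    """
    return [label for label, _ in groupby(frame_ids) if label != blank]
```

With these assumptions, the frame sequence `[0, 3, 3, 0, 3, 5, 5, 0]` collapses to `[3, 3, 5]`: the two runs of `3` stay distinct because a blank separates them.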

English keywords:

speech recognition; time-delay neural networks; Transformer; data augmentation; low resource

Authors:

Gu Longhao (顾龙昊), Huang Lianli (黄连丽), Zhou Kui (周奎), Zhang Ziyue (张子越)

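Author affiliations:

School of Electrical and Information Engineering, Hubei University of Automotive Technology

Sharing-X Key Joint Laboratory, Institute of Automotive Engineers, Hubei University of Automotive Technology, Shiyan, Hubei 442002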

Keywords (translated from the Chinese):

speech recognition; time-delay neural network; Transformer; data augmentation; low resource

Publication year:

2024

DOI:

10.11907/rjdk.232097

Journal: Software Guide (软件导刊)

Hubei Information Society (湖北省信息学会)

Impact factor: 0.524

ISSN: 1672-7800

Year, volume (issue): 2024, 23(9)