Research on Low-Resource Speech Recognition Based on ADTDNN
A speech recognition approach is proposed to address the reduced recognition accuracy and poorer generalization that result from insufficient training data in low-resource conditions. The method uses convolutional neural networks to extract feature information. It combines an attention mechanism with a delayed time-delay neural network, a combination referred to as ADTDNN, which enhances the model's ability to capture key information in sequences in low-resource environments. The approach employs connectionist temporal classification (CTC) to streamline the model's recognition process. Additionally, a Transformer is used as the language model. Experimental results on the Aishell-1 dataset show that, in low-resource settings, the ADTDNN-based speech recognition model reduces the word error rate by 3.7% and 1% compared with the mainstream end-to-end models LAS and Transformer, respectively.
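
As an illustration of how CTC can simplify recognition, the following minimal sketch shows greedy CTC decoding: the best label is taken for each frame, consecutive repeats are merged, and blank symbols are dropped. It is not the ADTDNN model itself. The function name `ctc_greedy_decode`, the blank index `0`, and the toy score matrix are assumptions made for this example only.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Turn per-frame label scores into a label sequence (greedy CTC decoding)."""
    # Pick the highest-scoring label index for every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # Merge consecutive repeats first, then remove blanks.
    collapsed = [label for label, _ in groupby(best_path)]
    return [label for label in collapsed if label != blank]


# Toy scores for 6 frames over 3 symbols (blank, label 1, label 2).
scores = [
    [0.10, 0.80, 0.10],  # frame 1: label 1
    [0.20, 0.70, 0.10],  # frame 2: label 1 (repeat, merged)
    [0.90, 0.05, 0.05],  # frame 3: blank (separates the two 1s)
    [0.10, 0.80, 0.10],  # frame 4: label 1
    [0.10, 0.20, 0.70],  # frame 5: label 2
    [0.10, 0.10, 0.80],  # frame 6: label 2 (repeat, merged)
]
decoded = ctc_greedy_decode(scores)
print(decoded)  # [1, 1, 2]
```

Because the blank symbol separates genuine repetitions, the decoder needs no frame-level alignment between audio and transcript, which is the sense in which CTC streamlines the recognition pipeline.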