Research on Low-resource Qingdao Dialect Speech Recognition Method
Dialect recognition is an important research direction in automatic speech recognition. Common speech recognition systems are trained on standard-language data, which results in poor performance on dialect speech. We therefore choose the Qingdao dialect as an application case for dialect speech recognition research. To address the lack of dialect corpora and the difficulty of training deep network models, both of which limit recognition accuracy, we propose applying data augmentation and building a dialect speech recognition model based on an improved Conformer. First, multi-source speech data are collected to construct a small-scale dialect corpus. Second, data augmentation techniques are applied to expand the training data and mitigate data scarcity. Finally, to better extract information, the down-sampling structure of the Conformer model is improved, and dilated convolution and the Mish activation function are introduced to realize a direct mapping from speech to text. Experimental results show that the end-to-end model with the improved down-sampling module, combined with data augmentation, reaches a character error rate of 25.96%, indicating that dialect recognition can be realized effectively under low-resource conditions.
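For readers unfamiliar with the reported metric, the following is a minimal sketch of how a character error rate is commonly computed: the character-level edit distance between the recognized text and the reference transcript, divided by the number of reference characters. The abstract does not state the exact scoring procedure used in the experiments, so the function names, the whitespace handling, and the sample sentences below are illustrative assumptions and are not taken from the Qingdao dialect corpus.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference symbols yields the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Corpus-level CER: total character edits / total reference characters."""
    total_errors = 0
    total_chars = 0
    for ref, hyp in zip(references, hypotheses):
        # Assumption: spaces are ignored, as is usual for Chinese transcripts.
        ref_chars = list(ref.replace(" ", ""))
        hyp_chars = list(hyp.replace(" ", ""))
        total_errors += edit_distance(ref_chars, hyp_chars)
        total_chars += len(ref_chars)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    return total_errors / total_chars


if __name__ == "__main__":
    # Hypothetical example: one substituted character in a six-character reference.
    refs = ["今天天气很好"]
    hyps = ["今天天汽很好"]
    print(f"CER = {character_error_rate(refs, hyps):.2%}")
```

Under this definition, a character error rate of 25.96% corresponds to roughly one character edit for every four reference characters.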