CTC Research on speech recognition of Xinjiang Uyghur accented Mandarin essays
Speech recognition performs poorly on short texts read aloud in Uyghur-accented Mandarin. To address this problem, this paper builds a speech dataset of short texts read aloud with a Uyghur accent, named CH_ESSAY_SET. Comparative experiments are carried out against models trained on the public datasets Aishell_1 and WenetSpeech, and against commercial speech recognition interfaces such as iFLYTEK, Baidu, Tencent, and Yunzhisheng. The results show that an end-to-end acoustic model trained on the self-built dataset recognizes Uyghur-accented Mandarin short texts significantly more accurately than both the public-dataset models and the speech recognition interfaces, which verifies the effectiveness of the self-built dataset. Two optimization methods are then proposed: multilingual task training based on transfer learning for feature transfer, and a pre-training system based on the WeNet framework. Experiments show that the proposed optimization method improves speech recognition accuracy by 8.9% over the baseline system and reaches a word error rate of 7.5%.
speech recognition; Uyghur accent; end-to-end; corpus construction
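The word error rate quoted in the abstract is conventionally defined as the edit distance between the reference transcript and the recognizer output, divided by the reference length. The sketch below illustrates that standard definition only; the scoring tool actually used in the paper is not stated here, and the function names `edit_distance` and `error_rate` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j tokens of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For Mandarin, transcripts are commonly scored per character, which in this sketch means passing `list(text)` for both arguments; for space-separated text, `text.split()` gives a word-level rate.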