End-to-End Endangered Language Speech Recognition Using Language Modeling
An effective way to protect an endangered language is to preserve audio and video recordings of it, which requires native speakers and professional linguists to annotate the corpus. Tujia is an endangered language with no writing system. Because corpus resources are scarce and its grammatical structure is unusual, speech recognition accuracy for Tujia is low, and recognition has so far stopped at the phonetic level. This paper proposes an end-to-end speech recognition model that integrates a Chinese word-level language model into the decoding stage of the acoustic model for joint decoding, and outputs Tujia speech labeled with Chinese sequences. The model first builds a hybrid speech recognition model based on Attention-CTC; second, a TransLM model, whose lexical-information-based modeling unit is the word-level IPA sequence, outputs the translation sequence. Experiments on Tujia speech data show that, compared with the Attention-based and CTC-based models, the proposed model reduces the word error rate (WER) of Tujia recognition by 10.3% and 9.6%, respectively. The accuracy of the phonetic sequence was also examined.
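As an illustration of what joint decoding with a language model can look like, the sketch below re-ranks candidate transcriptions by a weighted sum of CTC, attention, and language-model log-probabilities. This is a minimal sketch of a common log-linear combination, not the scoring rule from this paper: the `Hypothesis` class, the function names, the weights `ctc_weight` and `lm_weight`, and all score values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One candidate transcription with per-model log-probabilities (hypothetical)."""
    tokens: tuple
    ctc_logp: float   # log-probability from the CTC branch
    att_logp: float   # log-probability from the attention decoder
    lm_logp: float    # log-probability from the word-level language model


def joint_score(h, ctc_weight=0.3, lm_weight=0.5):
    """Combine the three scores log-linearly; the weights are illustrative only."""
    acoustic = ctc_weight * h.ctc_logp + (1.0 - ctc_weight) * h.att_logp
    return acoustic + lm_weight * h.lm_logp


def rerank(hyps, ctc_weight=0.3, lm_weight=0.5):
    """Return hypotheses sorted from best to worst joint score."""
    return sorted(
        hyps,
        key=lambda h: joint_score(h, ctc_weight, lm_weight),
        reverse=True,
    )


# Made-up scores: "a" is acoustically better, "b" is preferred by the language model.
h1 = Hypothesis(tokens=("a",), ctc_logp=-2.0, att_logp=-2.0, lm_logp=-8.0)
h2 = Hypothesis(tokens=("b",), ctc_logp=-3.0, att_logp=-2.5, lm_logp=-2.0)
hyps = [h1, h2]

best_with_lm = rerank(hyps)[0]
best_without_lm = rerank(hyps, lm_weight=0.0)[0]
print(best_with_lm.tokens, best_without_lm.tokens)
```

In this toy case the language model changes which hypothesis ranks first. In practice such a score is usually applied inside beam search at each decoding step rather than to finished hypotheses, and the weights are tuned on held-out data.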