Research on Tibetan Driven Visual Speech Synthesis Algorithm Based on Audio Matching
To address the low accuracy of lip contour detection and the poor quality of visual speech synthesis, a Tibetan-driven visual speech synthesis algorithm based on audio matching is proposed. First, the algorithm extracts the short-time energy and short-time zero-crossing rate from the Tibetan speech signal that drives the visual synthesis, constructs the short-time autocorrelation function of the signal, and extracts feature information from it to obtain the pitch track of the Tibetan speech signal. Second, a spatiotemporal analysis model of the lips is established to analyze how the lip contour changes during pronunciation, and lip contour features are extracted by principal component analysis. Finally, the correlation between the audio features and the lip contour features is obtained through an input-output hidden Markov model, and Tibetan-driven visual speech is synthesized on the basis of audio matching. Experimental results show that the proposed method achieves high lip contour detection accuracy and good visual speech synthesis quality.
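As an illustration of the audio feature stage described above, the following is a minimal sketch of short-time energy, short-time zero-crossing rate, and autocorrelation-based pitch estimation. It is not the implementation used in this work: the function names, the frame length, the hop size, and the pitch search range (`fmin`, `fmax`) are assumptions chosen for the example, and a synthetic 200 Hz tone stands in for a Tibetan speech recording.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.where(frames >= 0, 1, -1)
    crossings = np.sum(np.abs(np.diff(signs, axis=1)), axis=1) / 2.0
    return crossings / frames.shape[1]


def autocorr_pitch(frame, sr, fmin=60.0, fmax=500.0):
    """Estimate pitch (Hz) from the peak of the short-time autocorrelation.

    fmin and fmax bound the lag search; they are illustrative defaults,
    not values taken from the paper.
    """
    frame = frame - np.mean(frame)
    # Keep only non-negative lags of the full autocorrelation.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)
    lag_max = min(int(sr / fmin), len(r) - 1)
    lag = lag_min + int(np.argmax(r[lag_min:lag_max + 1]))
    return sr / lag


# Synthetic stand-in for a voiced speech segment: a 200 Hz tone at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 200.0 * t)

frames = frame_signal(signal, frame_len=400, hop=200)
energy = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
pitch_track = np.array([autocorr_pitch(f, sr) for f in frames])
```

In a real pipeline, the energy and zero-crossing rate would typically be used to separate voiced frames from silence or unvoiced frames before the pitch estimate is trusted, so that `pitch_track` contains values only where a pitch actually exists.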