Comparison of Khasi Speech Representations with Different Spectral Features and Hidden Markov States

In this paper, we compare Khasi speech representations built from four different spectral features and present a novel extension toward the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstral coefficients (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficients (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, hidden Markov model (HMM) based recognizers were built with varying numbers of HMM states and different Gaussian mixture models (GMMs). Performance was evaluated using the word error rate (WER). The experimental results show that MFCC provides a better representation of Khasi speech than the other three spectral features.
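
As an illustration of the evaluation metric named above, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example; the abstract does not state which scoring tool the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration only; tokenization is a plain
    whitespace split, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference. A lower WER indicates a better recognizer, which is the sense in which one spectral feature is reported to outperform the others.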

Acoustic model (AM); Gaussian mixture model (GMM); hidden Markov model (HMM); language model (LM); linear predictive coding (LPC); linear prediction cepstral coefficient (LPCC); Mel frequency cepstral coefficient (MFCC); perceptual linear prediction (PLP).

Bronson Syiem, Sushanta Kabir Dutta, Juwesh Binong, Lairenlakpam Joyprakash Singh


Department of Electronics and Communication Engineering, North-Eastern Hill University, Shillong 793022

This work was supported by the Visvesvaraya Ph.D. Scheme for Electronics and IT students launched by the Ministry of Electronics

PhD-MLA/495/2015-2016

2021

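Journal of Electronic Science and Technology (电子科技学刊)
University of Electronic Science and Technology of China (电子科技大学)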


CSCD
Impact factor: 0.154
ISSN: 1674-862X
Year, volume (issue): 2021, 19(2)