Comparison of Khasi Speech Representations with Different Spectral Features and Hidden Markov States

Abstract: In this paper, we present a comparison of Khasi speech representations with four different spectral features and a novel extension towards the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, different hidden Markov model (HMM) based recognizers were built, varying the number of HMM states and the Gaussian mixture models (GMMs). Performance was evaluated using the word error rate (WER). The experimental results show that MFCC provides a better representation for Khasi speech than the other three spectral features.
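As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of a WER computation. It assumes the standard definition, WER = (S + D + I) / N, where S, D, and I are the word substitutions, deletions, and insertions needed to turn the reference transcript into the recognizer output, and N is the number of reference words. The function name `word_error_rate` and the placeholder transcripts are hypothetical and do not come from the paper, which may have used a toolkit's own scoring script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (S + D + I) / N using word-level edit distance.

    Hypothetical helper for illustration; not the paper's scoring tool.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Khasi data: one substitution and one deletion
# against a four-word reference.
wer = word_error_rate("w1 w2 w3 w4", "w1 wx w3")
print(f"WER = {wer:.2%}")
```

Under this definition a lower WER means a better recognizer, and the value can exceed 100% when the hypothesis contains many insertions.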

Keywords:

Acoustic model (AM); Gaussian mixture model (GMM); hidden Markov model (HMM); language model (LM); linear predictive coding (LPC); linear prediction cepstral coefficient (LPCC); Mel frequency cepstral coefficient (MFCC); perceptual linear prediction (PLP)

Authors:

Bronson Syiem, Sushanta Kabir Dutta, Juwesh Binong, Lairenlakpam Joyprakash Singh

Affiliation:

Department of Electronics and Communication Engineering, North-Eastern Hill University, Shillong 793022

Funding:

This work was supported by the Visvesvaraya Ph.D. Scheme for Electronics and IT students launched by the Ministry of Electronics

Project number:

PhD-MLA/495/2015-2016

Publication year:

2021

DOI:

10.1016/j.jnlest.2020.100079

Journal of Electronic Science and Technology

University of Electronic Science and Technology of China

CSCD

Impact factor: 0.154

ISSN: 1674-862X

Year, volume (issue): 2021, 19(2)

Number of references: 20