Using Highway Connections to Enable Deep Small-footprint LSTM-RNNs for Speech Recognition


Long short-term memory RNNs (LSTM-RNNs) have shown great success in automatic speech recognition (ASR) and have become the state-of-the-art acoustic model for time-sequence modeling tasks. However, it remains difficult to train deep LSTM-RNNs while keeping the number of parameters small. We use highway connections between the memory cells of adjacent layers to train small-footprint highway LSTM-RNNs (HLSTM-RNNs), which are deeper and thinner than conventional LSTM-RNNs. Experiments on the Switchboard (SWBD) corpus show that we can train thinner and deeper HLSTM-RNNs that have fewer parameters than conventional 3-layer LSTM-RNNs yet achieve a lower word error rate (WER). Compared with their small-footprint LSTM-RNN counterparts, the small-footprint HLSTM-RNNs yield a greater reduction in WER.
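The core idea of the highway connection described above is that, in addition to the standard LSTM gates, a "depth" gate lets the lower layer's memory cell flow directly into the current layer's memory cell. The sketch below illustrates one time step of such a layer in NumPy; the function name `hlstm_step`, the parameter dictionary `p`, and the exact gate parameterization are assumptions for illustration and may differ from the paper's formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hlstm_step(x, h_prev, c_prev, c_lower, p):
    """One time step of a highway LSTM layer (illustrative sketch).

    x       -- input from the lower layer at this time step
    h_prev  -- this layer's hidden state from the previous time step
    c_prev  -- this layer's cell state from the previous time step
    c_lower -- the LOWER layer's cell state at this time step
    p       -- dict of weight matrices/vectors (hypothetical names)
    """
    z = np.concatenate([x, h_prev])
    i = sigmoid(p["Wi"] @ z + p["bi"])    # input gate
    f = sigmoid(p["Wf"] @ z + p["bf"])    # forget gate
    o = sigmoid(p["Wo"] @ z + p["bo"])    # output gate
    g = np.tanh(p["Wg"] @ z + p["bg"])    # candidate cell update
    # Depth (highway) gate: also conditioned on the lower layer's cell
    d = sigmoid(p["Wd"] @ z + p["wd"] * c_lower + p["bd"])
    # Highway connection: the lower layer's memory is carried directly
    # into this layer's cell, gated elementwise by d
    c = f * c_prev + i * g + d * c_lower
    h = o * np.tanh(c)
    return h, c
```

Because the gated path `d * c_lower` gives gradients a direct route across layers, stacking many such thin layers remains trainable, which is what allows the deeper-and-thinner small-footprint configurations reported in the abstract.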

Long short-term memory; Highway connections; Small-footprint; Speech recognition

CHENG Gaofeng, LI Xin, YAN Yonghong


Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Beijing 100190, China

University of Chinese Academy of Sciences, Beijing 100049, China

This work is supported by the National Key Research and Development Program (Nos. 2016YFB0801203, 2016YFB0801200) and the National Natural Science Foundation of China (Nos. 11590774, 11590770).

2019

Chinese Journal of Electronics


Indexed in: CSTPCD, CSCD, SCI, EI
ISSN: 1022-4653
Year, Volume (Issue): 2019, 28(1)