Using Highway Connections to Enable Deep Small-footprint LSTM-RNNs for Speech Recognition


Long short-term memory RNNs (LSTM-RNNs) have shown great success in automatic speech recognition (ASR) and have become the state-of-the-art acoustic model for time-sequence modeling tasks. However, it remains difficult to train deep LSTM-RNNs while keeping the number of parameters small. We use highway connections between the memory cells of adjacent layers to train small-footprint highway LSTM-RNNs (HLSTM-RNNs), which are deeper and thinner than conventional LSTM-RNNs. Experiments on the Switchboard (SWBD) corpus show that we can train thinner and deeper HLSTM-RNNs that have fewer parameters than conventional 3-layer LSTM-RNNs yet achieve a lower word error rate (WER). Compared with their small-footprint LSTM-RNN counterparts, the small-footprint HLSTM-RNNs yield a greater reduction in WER.
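The core idea of the highway connection described above is that, in addition to the standard LSTM gates, a "depth" gate lets the lower layer's memory cell flow directly into the current layer's memory cell. The sketch below illustrates one time step of such a layer in NumPy; the function name `hlstm_step`, the parameter dictionary `p`, and the exact gate parameterization are assumptions for illustration and may differ from the paper's formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hlstm_step(x, h_prev, c_prev, c_lower, p):
    """One time step of a highway LSTM layer (illustrative sketch).

    x       -- input from the lower layer at this time step
    h_prev  -- this layer's hidden state from the previous time step
    c_prev  -- this layer's cell state from the previous time step
    c_lower -- the LOWER layer's cell state at this time step
    p       -- dict of weight matrices/vectors (hypothetical names)
    """
    z = np.concatenate([x, h_prev])
    i = sigmoid(p["Wi"] @ z + p["bi"])    # input gate
    f = sigmoid(p["Wf"] @ z + p["bf"])    # forget gate
    o = sigmoid(p["Wo"] @ z + p["bo"])    # output gate
    g = np.tanh(p["Wg"] @ z + p["bg"])    # candidate cell update
    # Depth (highway) gate: also conditioned on the lower layer's cell
    d = sigmoid(p["Wd"] @ z + p["wd"] * c_lower + p["bd"])
    # Highway connection: the lower layer's memory is carried directly
    # into this layer's cell, gated elementwise by d
    c = f * c_prev + i * g + d * c_lower
    h = o * np.tanh(c)
    return h, c
```

Because the gated path `d * c_lower` gives gradients a direct route across layers, stacking many such thin layers remains trainable, which is what allows the deeper-and-thinner small-footprint configurations reported in the abstract.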

Long short-term memory; Highway connections; Small-footprint; Speech recognition

CHENG Gaofeng, LI Xin, YAN Yonghong


Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Beijing 100190, China

University of Chinese Academy of Sciences, Beijing 100049, China

This work is supported by the National Key Research and Development Program (Nos. 2016YFB0801203, 2016YFB0801200) and the National Natural Science Foundation of China (Nos. 11590774, 11590770).

2019

Chinese Journal of Electronics


Indexed in: CSTPCD, CSCD, SCI, EI
ISSN: 1022-4653
Year, Volume (Issue): 2019, 28(1)