Speech Subtitle Conversion Technology Based on LSTM

Liu Junli (刘俊丽)¹

Author information

1. Xiangfen County Integrated Media Center, Linfen, Shanxi 041500, China
Abstract

To address the problems of real-time speech recognition, a speech subtitle conversion technology based on Long Short-Term Memory (LSTM) is proposed. First, the overall framework for generating real-time subtitles in online live streaming is introduced. Second, the application of LSTM to speech subtitle conversion is described in detail. Finally, experiments are conducted on the LibriSpeech dataset. The experimental results show that the LSTM-based speech subtitle conversion technology adapts well to diverse audio data.

Key words

speech recognition / subtitle generation / Long Short-Term Memory (LSTM) / online live streaming

Publication year

2024

Audio Engineering (电声技术)

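Television and Electroacoustics Research Institute (The Third Research Institute of China Electronics Technology Group Corporation)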

Impact factor: 0.259

ISSN: 1002-8684

Number of references: 10
