Research on Lightweight End-to-End Speech Command Recognition Models

Abstract: To meet the small-model-size and low-latency requirements of small-vocabulary speech command recognition in smart homes, this paper designs two lightweight Chinese end-to-end speech command recognition models based on neural networks and connectionist temporal classification (CTC). The models are made lightweight by reducing the number of network layers and simplifying the network structure. CTC enables end-to-end training and decoding with Chinese characters as the modeling units, which removes the need to pre-align the data. The models are compared on the public Aishell-I dataset and a self-built corpus. With a model size of 350 kB, a runtime of 5 ms, a character error rate of 5.02%, and an intent hit rate of 92.0%, the CNN-CTC model is judged the better fit for small-vocabulary speech command recognition.
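As a rough illustration of the character-level CTC decoding and the character error rate mentioned in the abstract, here is a minimal Python sketch. It is not the authors' implementation: the blank index, the toy vocabulary `id_to_char`, the frame sequence, and all function names are assumptions made for illustration, and greedy (best-path) decoding is only one common way to decode CTC output.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_ids, id_to_char, blank=BLANK):
    """Collapse consecutive repeated labels, then drop blanks."""
    collapsed = [label for label, _ in groupby(frame_ids)]
    return "".join(id_to_char[i] for i in collapsed if i != blank)


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (0 if match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical toy vocabulary and per-frame argmax labels.
id_to_char = {1: "打", 2: "开", 3: "空", 4: "调"}
frames = [0, 1, 1, 0, 2, 2, 0, 0, 3, 4, 4]

hypothesis = ctc_greedy_decode(frames, id_to_char)
print(hypothesis, character_error_rate("打开空调", hypothesis))
```

Because each output label is a Chinese character, the decoded string can be compared with the reference command directly, with no pronunciation lexicon or frame-level alignment in between.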

Keywords: speech command recognition; end-to-end; lightweight; CTC

Authors: 黄晁, 赵忆, 张从连, 袁敏杰, 陈春燕

Affiliation: 宁波中科信息技术应用研究院 (宁波人工智能产业研究院), Ningbo, Zhejiang 315040, China

Publication year: 2024

Journal: 工业控制计算机 (Industrial Control Computer)

中国计算机学会工业控制计算机专业委员会; 江苏省计算技术研究所有限责任公司

Impact factor: 0.258

ISSN: 1001-182X

Year, volume (issue): 2024, 37(8)