Speech Event Recognition Model Based on Deep Learning for People with Dysarthria

Dysarthria affects many patients with certain medical conditions and makes their articulation unclear. To better recognize the speech events that patients with dysarthria express, this paper proposes a new deep learning-based speech event recognition model. The model takes speech segments as input, uses the Gramian Angular Field to preserve the original features of the time series, uses a Conformer to extract local and global features of the sequence, and finally uses ResNet as the classifier. Experimental results on the EasyCall corpus dataset show that the proposed model achieves good recognition performance.
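The abstract does not give the encoding details, so the following is only a minimal sketch of how a Gramian Angular Field can turn a one-dimensional sequence into an image. The function name `gramian_angular_field`, the `method` parameter, and the choice of min-max scaling to [-1, 1] are assumptions made for illustration, not the authors' implementation. Whether the paper encodes raw waveform samples or derived acoustic features is also not stated.

```python
import numpy as np


def gramian_angular_field(x, method="summation"):
    """Encode a 1-D series as a Gramian Angular Field image (illustrative sketch).

    Assumption: the series is min-max scaled to [-1, 1] before the
    polar-coordinate mapping; the source does not specify the scaling.
    """
    x = np.asarray(x, dtype=np.float64)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant series has no range to scale; map it to zeros.
        scaled = np.zeros_like(x)
    else:
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    # Guard against tiny floating-point overshoot before arccos.
    scaled = np.clip(scaled, -1.0, 1.0)
    phi = np.arccos(scaled)  # angle of each sample in polar coordinates

    if method == "summation":
        # Summation field: cos(phi_i + phi_j)
        return np.cos(phi[:, None] + phi[None, :])
    if method == "difference":
        # Difference field: sin(phi_i - phi_j)
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("method must be 'summation' or 'difference'")


# Usage: a short synthetic frame stands in for a speech segment.
frame = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
image = gramian_angular_field(frame)
print(image.shape)
```

In the summation field, the main diagonal equals 2x² − 1 for the scaled samples, so their magnitudes can be read back from the image. This is one sense in which the encoding keeps the original features of the time series. Because the image has one row and one column per sample, long segments are typically shortened or downsampled before encoding.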

Keywords: speech event recognition; Gramian Angular Field; Conformer; ResNet

Yang Yi (杨熠), Zhang Zipeng (张子鹏)


Sun Yueqi Honors College, China University of Mining and Technology, Xuzhou, Jiangsu 221008, China


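Funding: China University of Mining and Technology, national-level "Undergraduate Innovation Training Program" project (No. 202110290091Z)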

2024

Computer and Information Technology (电脑与信息技术)
Chinese Institute of Electronics; Hunan Electronics Research Institute


Impact factor: 0.256
ISSN: 1005-1228
Year, volume (issue): 2024, 32(1)
  • 16