A Study on a Feature Enhancement Algorithm Based on the Transformer Model and Its Application
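Li Junhua (李俊华) 1, Duan Zhikui (段志奎) 1, Yu Xinmei (于昕梅) 1
Author information
- 1. School of Electronic Information Engineering, Foshan University (佛山科学技术学院), Foshan, Guangdong 528225, China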
Abstract
The Transformer model performs well in automatic speech recognition (ASR), but its feature extraction has two shortcomings. First, the model concentrates on extracting global feature interactions and overlooks other useful information, such as local feature interactions. Second, it does not make full use of low-level feature interactions. To address these issues, we propose a Convolutional Linear Mapping (CMLP) module that strengthens local feature interactions, and a Low-level Feature Fusion (LF) module that fuses high-level and low-level features. Integrating the two modules yields the CLformer model. Experiments on two Mandarin Chinese datasets (Aishell-1 and HKUST) show that CLformer improves on the baseline by 0.3% on Aishell-1 and by 0.5% on HKUST, which supports the effectiveness of the proposed optimization strategy.
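The abstract names the two ideas behind CLformer but does not give the internals of the CMLP or LF modules. The NumPy sketch below is only a rough illustration of those two ideas under stated assumptions: a convolution over neighbouring frames followed by a linear mapping to capture local interactions, and a weighted sum of layer outputs to combine low-level and high-level features. The function names (`local_block`, `fuse_layers`), the kernel size, the residual connection, and the softmax weighting are all hypothetical choices made for this example, not the authors' design.

```python
import numpy as np

def depthwise_conv1d(x, kernel):
    """Depthwise 1-D convolution over time with 'same' zero padding.

    Implemented as cross-correlation, as is usual in deep learning.
    x: (T, D) array of T frames with D channels; kernel: (K, D), K odd.
    """
    T, _ = x.shape
    K = kernel.shape[0]
    pad = K // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))  # zero-pad the time axis only
    out = np.zeros_like(x)
    for k in range(K):
        # Each tap mixes a frame with one of its temporal neighbours.
        out += xp[k:k + T] * kernel[k]
    return out

def local_block(x, kernel, w, b):
    """Hypothetical local-interaction block (an assumption, not the paper's CMLP):
    convolution over nearby frames, linear mapping, ReLU, residual connection."""
    h = depthwise_conv1d(x, kernel)
    h = np.maximum(h @ w + b, 0.0)
    return x + h

def fuse_layers(layer_outputs, weights):
    """Hypothetical fusion (an assumption, not the paper's LF module):
    softmax-weighted sum of per-layer outputs of identical shape."""
    w = np.exp(weights - np.max(weights))
    w = w / w.sum()
    return sum(wi * h for wi, h in zip(w, layer_outputs))

rng = np.random.default_rng(0)
T, D = 50, 8                                # 50 frames, 8 feature channels
x = rng.standard_normal((T, D))
kernel = rng.standard_normal((3, D)) * 0.1  # kernel size 3 is an arbitrary choice
w = rng.standard_normal((D, D)) * 0.1
b = np.zeros(D)

low = local_block(x, kernel, w, b)          # stands in for a lower-layer output
high = local_block(low, kernel, w, b)       # stands in for a higher-layer output
fused = fuse_layers([low, high], np.zeros(2))
print(fused.shape)  # (50, 8)
```

With equal fusion weights the result is simply the average of the two layer outputs; in a trained model such weights would normally be learned.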
Keywords
Transformer model/automatic speech recognition/feature enhancement/local feature/feature fusion/global feature
Funding
Key Laboratory Program of Guangdong Regular Higher Education Institutions (2021KSYS008)
Publication year
2024