A Study of a Transformer-Based Feature Enhancement Algorithm and Its Application

Li Junhua (李俊华), Duan Zhikui (段志奎), Yu Xinmei (于昕梅)

Author information

  • 1. School of Electronic Information Engineering, Foshan University, Foshan 528225, Guangdong, China

Abstract

The Transformer model performs well in automatic speech recognition (ASR), but its feature extraction has two weaknesses. First, it concentrates on global feature interactions and overlooks other useful information, such as local feature interactions. Second, it does not make full use of low-level feature interactions. To address these issues, we propose a Convolutional Linear Mapping (CMLP) module that strengthens local feature interactions and a Low-level Feature Fusion (LF) module that fuses high-level and low-level features. Integrating the two modules yields the CLformer model. Experiments on two Mandarin Chinese datasets (Aishell-1 and HKUST) show that CLformer improves on the baseline by 0.3% on Aishell-1 and by 0.5% on HKUST, which supports the effectiveness of the proposed optimization strategy.
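
The abstract does not describe the internals of the CMLP or LF modules, so the following is only a minimal plain-Python sketch of the two general ideas it names: mixing each frame with its temporal neighbours (local feature interaction) and combining the outputs of a lower and a higher layer (feature fusion). The function names `local_mix` and `fuse_layers`, the shared smoothing kernel, and the fixed fusion weights are all hypothetical choices made for illustration, not the authors' design.

```python
def local_mix(frames, kernel):
    """Depthwise 1-D convolution over time with zero padding (same length out).

    frames: list of T frames, each a list of D floats.
    kernel: odd-length list of weights, shared by all D channels (an assumption).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    num_frames, dim = len(frames), len(frames[0])
    out = []
    for t in range(num_frames):
        row = [0.0] * dim
        for k, w in enumerate(kernel):
            s = t + k - half          # index of the neighbouring frame
            if 0 <= s < num_frames:   # frames outside the sequence count as zeros
                for d in range(dim):
                    row[d] += w * frames[s][d]
        out.append(row)
    return out


def fuse_layers(layer_outputs, weights):
    """Weighted sum of same-shaped outputs taken from several layers."""
    if len(layer_outputs) != len(weights):
        raise ValueError("need exactly one weight per layer output")
    num_frames, dim = len(layer_outputs[0]), len(layer_outputs[0][0])
    return [
        [
            sum(w * layer[t][d] for layer, w in zip(layer_outputs, weights))
            for d in range(dim)
        ]
        for t in range(num_frames)
    ]


# Toy input: 3 frames with 2 feature channels each.
low = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
high = local_mix(low, [0.25, 0.5, 0.25])     # stand-in for a higher layer
fused = fuse_layers([low, high], [0.5, 0.5])  # equal-weight fusion (assumed)
```

In a real encoder the kernel and fusion weights would be learned parameters, and the higher-layer output would come from the Transformer blocks themselves; the toy values here only show the data flow.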

Key words

Transformer model/automatic speech recognition/feature enhancement/feature fusion/local feature/global feature

Funding

Key Laboratory of Guangdong Provincial Universities funding project (2021KSYS008)

Publication year

2024
Journal of Foshan University (Natural Science Edition)
Foshan University

Impact factor: 0.226
ISSN: 1008-0171
Number of references: 7