Neural Networks, 2022, Vol. 148. DOI: 10.1016/j.neunet.2022.01.016

Improving data augmentation for low resource speech-to-text translation with diverse paraphrasing

Mi, Chenggang¹; Xie, Lei²; Zhang, Yanning²

Author Information

  • 1. Institute of Foreign Languages & Literature, Xi'an International Studies University
  • 2. School of Computer Science, Northwestern Polytechnical University

Abstract

A high-quality end-to-end speech translation model relies on large-scale speech-to-text training data, which is usually scarce or even unavailable for some low-resource language pairs. To overcome this, we propose a target-side data augmentation method for low-resource speech translation. In particular, we first generate large-scale target-side paraphrases with a paraphrase generation model that incorporates several statistical machine translation (SMT) features and the commonly used recurrent neural network (RNN) feature. Then, a filtering model that combines semantic similarity and speech-word pair co-occurrence is proposed to select the highest-scoring source speech-target paraphrase pairs from the candidates. Experimental results on English, Arabic, German, Latvian, Estonian, Slovenian and Swedish paraphrase generation show that the proposed method achieves significant and consistent improvements over several strong baseline models on the PPDB datasets (http://paraphrase.org/). To introduce the paraphrase generation results into low-resource speech translation, we propose two strategies: audio-text pair recombination and multiple-reference training. Experimental results show that speech translation models trained on the new audio-text datasets, which incorporate the paraphrase generation results, lead to substantial improvements over the baselines, especially on low-resource languages. (C) 2022 Elsevier Ltd. All rights reserved.
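The augmentation pipeline the abstract describes — generate target-side paraphrases, filter them with a scoring model, then recombine each retained paraphrase with the original audio — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the paraphrase table, and the stand-in scoring function are all assumptions; in the paper the score combines semantic similarity and speech-word pair co-occurrence from trained models.

```python
# Hypothetical sketch of target-side augmentation via paraphrasing.
# Each (audio, target-text) pair is expanded with scored paraphrases of the
# target text; only candidates above a threshold are kept (the filtering
# step), and each survivor is paired with the original audio (the
# audio-text pair recombination step).

def filter_paraphrases(candidates, score_fn, threshold):
    """Keep paraphrases whose score meets the threshold. In the paper the
    score would come from semantic similarity and speech-word co-occurrence
    models; here score_fn is an arbitrary placeholder."""
    return [p for p in candidates if score_fn(p) >= threshold]

def recombine(dataset, paraphrase_table, score_fn, threshold=0.5):
    """Audio-text pair recombination: pair the original audio with each
    retained target-side paraphrase, enlarging the training set."""
    augmented = []
    for audio, target in dataset:
        augmented.append((audio, target))  # always keep the original pair
        candidates = paraphrase_table.get(target, [])
        for p in filter_paraphrases(candidates, score_fn, threshold):
            augmented.append((audio, p))
    return augmented

# Toy usage: a trivial length-based "score" stands in for the real models.
data = [("utt1.wav", "the cat sat")]
table = {"the cat sat": ["a cat was sitting", "cat"]}
score = lambda p: min(len(p), 12) / 12.0
print(recombine(data, table, score, threshold=0.9))
```

The same augmented list could instead be used for the paper's second strategy, multiple-reference training, by grouping all retained paraphrases of one utterance as alternative references rather than as separate training pairs.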

Key words

Data augmentation; Speech translation; Paraphrasing


Publication Year

2022

Journal

Neural Networks
Indexed in: EI, SCI
ISSN: 0893-6080
Cited by: 3
References: 53