Neural machine translation method integrating BERT's pre-trained language knowledge
[Objective] To address the problem that fine-tuning alone cannot make full use of pre-trained language knowledge in neural machine translation. [Methods] A neural machine translation method based on two-stage interactive fusion with a pre-trained model is proposed. First, the multi-layer representations of the BERT pre-trained model are extracted and used to construct a mask knowledge matrix, through which the pre-training knowledge contained in BERT is applied to the encoder word-embedding layer of the neural machine translation model. Second, an adaptive fusion module extracts the beneficial knowledge from BERT's multi-layer representations and fuses it interactively with the neural machine translation model. [Results] Experimental results show that, compared with the Transformer baseline, the proposed method improves the BLEU score by 1.41~4.20 on multiple neural machine translation tasks; it also achieves significant performance gains over other neural machine translation methods that integrate pre-trained knowledge. [Conclusion] The proposed two-stage interactive fusion method alleviates catastrophic forgetting, reduces the discrepancy between the pre-trained model and the neural machine translation model caused by their different training objectives, and can effectively exploit pre-trained language knowledge to improve the performance of neural machine translation models.
machine translation; pre-trained language model; attention mechanism; Transformer network model
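To make the second stage more concrete, the sketch below illustrates one way to adaptively weight BERT's layer representations and gate them into the translation model's encoder embeddings. This is a minimal illustration under assumed shapes and a PyTorch implementation; the module and parameter names (AdaptiveBertFusion, layer_weights, gate) are hypothetical and do not correspond to the authors' code, and the construction of the mask knowledge matrix is not shown.

```python
import torch
import torch.nn as nn

class AdaptiveBertFusion(nn.Module):
    """Sketch of adaptive fusion of BERT multi-layer representations with
    NMT encoder embeddings (illustrative; names and shapes are assumptions)."""

    def __init__(self, num_bert_layers: int, bert_dim: int, model_dim: int):
        super().__init__()
        # Learnable scalar weight per BERT layer, softmax-normalized in forward()
        self.layer_weights = nn.Parameter(torch.zeros(num_bert_layers))
        # Project BERT hidden size to the NMT model dimension
        self.proj = nn.Linear(bert_dim, model_dim)
        # Gate deciding, per position, how much BERT knowledge to mix in
        self.gate = nn.Linear(2 * model_dim, model_dim)

    def forward(self, bert_layers: torch.Tensor, nmt_embeddings: torch.Tensor) -> torch.Tensor:
        # bert_layers:     (num_layers, batch, seq_len, bert_dim)
        # nmt_embeddings:  (batch, seq_len, model_dim)
        w = torch.softmax(self.layer_weights, dim=0)              # (num_layers,)
        mixed = (w.view(-1, 1, 1, 1) * bert_layers).sum(dim=0)    # weighted sum over layers
        mixed = self.proj(mixed)                                  # map to model_dim
        g = torch.sigmoid(self.gate(torch.cat([nmt_embeddings, mixed], dim=-1)))
        # Gated interactive fusion: keep part of the original embedding,
        # inject part of the BERT-derived representation
        return g * mixed + (1.0 - g) * nmt_embeddings
```

A module of this kind would typically sit between the NMT encoder's word-embedding layer and its first self-attention block, so the translation model can decide how much pre-trained knowledge to absorb at each position.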