Research on Machine Translation for Markup Language
Compared with plain-text translation, markup language translation suffers from low translation quality caused by complex and diverse markup formats. This paper proposes a combined generalization-based method for markup language translation. To evaluate how well the markup format is restored, it proposes measuring tag position accuracy, precision, recall, and F1 score. Compared with truncation-based, word-alignment-based, and existing generalization methods, the proposed method achieves a significant improvement in BLEU, and its format restoration rate is close to 100%.
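As a rough illustration of what a tag position metric could look like, the following Python sketch scores a translated string against a reference by comparing where each tag sits relative to the surrounding words. The abstract does not define the metric, so everything here is an assumption: the position of a tag is taken to be the number of whitespace-delimited words before it, tags are matched as exact strings, and the function names `tag_positions` and `tag_position_prf` are hypothetical. Accuracy is left out because the abstract does not say how it differs from precision.

```python
import re
from collections import Counter

# Assumption: a tag is anything of the form <...> with no nested angle brackets.
TAG_RE = re.compile(r"<[^<>]+>")
# Splits text into tags and whitespace-delimited words, in order.
PIECE_RE = re.compile(r"<[^<>]+>|[^\s<>]+")


def tag_positions(text):
    """Return a multiset of (tag, number of words preceding the tag)."""
    positions = Counter()
    words_seen = 0
    for piece in PIECE_RE.findall(text):
        if TAG_RE.fullmatch(piece):
            positions[(piece, words_seen)] += 1
        else:
            words_seen += 1
    return positions


def tag_position_prf(hypothesis, reference):
    """Precision, recall, and F1 over (tag, position) pairs."""
    hyp = tag_positions(hypothesis)
    ref = tag_positions(reference)
    # Multiset intersection counts tags placed at the same position in both.
    matched = sum((hyp & ref).values())
    precision = matched / sum(hyp.values()) if hyp else 0.0
    recall = matched / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    reference = "Click <b>here</b> now"
    hypothesis = "Click <b>here now</b>"  # closing tag restored one word late
    print(tag_position_prf(hypothesis, reference))
```

In this example the opening tag is restored at the right position and the closing tag is not, so one of two tags matches. Comparing positions by word count only makes sense when the hypothesis and reference are in the same language and tokenized the same way; a real evaluation would need a tokenization suited to the target language.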