Objective To construct a scientific, standardized intelligent machine translation model for ancient books of traditional Chinese medicine (TCM) that accurately translates them into modern Chinese, and potentially English, and to provide a reference for clinical medical learning and the dissemination of TCM. Methods First, machine translation of TCM ancient books was studied, and initial experiments were conducted to construct a sentence-level parallel corpus of 969,754 parallel sentence pairs. Second, a Seq2Seq model with an attention mechanism (Seq2Seq+Attention) and a pre-trained Seq2Seq model (Pre-Training+Seq2Seq), pre-trained on 800,000 ancient poems, were built. Finally, experiments were conducted on the constructed dataset, with BLEU1, BLEU2 and F1 as evaluation metrics, to verify the effectiveness of the models and the feasibility of further optimization. Results The F1 value of the Pre-Training+Seq2Seq model reached 65.72%. Conclusion The Pre-Training+Seq2Seq model is effective and offers ideas for intelligent machine translation of TCM ancient books.
Key words
TCM ancient books/classical Chinese/corpus/text alignment/machine translation
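The abstract names BLEU1, BLEU2 and F1 as evaluation metrics but does not say how they were computed. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: BLEU-n is taken as the standard clipped n-gram precision with uniform weights and a brevity penalty against a single reference, and F1 is assumed to be the harmonic mean of token-level precision and recall. The function names `ngrams`, `modified_precision`, `bleu` and `token_f1` are hypothetical, and the tokenization (for example, character-level for Chinese) is left to the caller.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n):
    """BLEU with uniform weights over orders 1..max_n and a brevity penalty.

    Assumption: single reference, no smoothing (any zero precision gives 0.0).
    """
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if not candidate or min(precisions) == 0.0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: candidates no longer than the reference are penalized.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_mean)


def token_f1(candidate, reference):
    """Assumed F1: harmonic mean of token-level precision and recall."""
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, `bleu(candidate, reference, 1)` and `bleu(candidate, reference, 2)` would play the role of BLEU1 and BLEU2 for one sentence pair, and `token_f1(candidate, reference)` the role of F1. Corpus-level scores, as typically reported, aggregate n-gram counts over all sentence pairs rather than averaging per-sentence values.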