AIGC Empowering the Revitalization of Ancient Books on Traditional Chinese Medicine: Building the Huang-Di Large Language Model
Many databases of ancient books on Traditional Chinese Medicine (TCM) exist, but digital research in this area is still dominated by shallow knowledge services such as document scanning and collation, browsing, and retrieval. The development of generative AI provides new opportunities for digital research on ancient TCM books. Based on the open-source Ziya-LLaMA-13B-V1 model, this article designs a generative dialogue large language model for ancient TCM works through a full pipeline of continued pre-training, supervised fine-tuning, and DPO optimization, and verifies its performance through automatic and manual evaluation. The automatic evaluation results show that the training loss converges successfully, and that the BLEU and ROUGE scores are relatively low in every dialogue category, which indirectly reflects the strong domain creativity of the model. The manual evaluation results show that, in knowledge Q&A, the model significantly outperforms the two existing types of models in the TCM vertical domain and performs better than Tongyi Qianwen, while in some categories, such as disease prevention and health care, its response ability is slightly weaker than that of ChatGPT (GPT-4). By breaking the habitual pattern of digital research on ancient TCM books, this study achieves an in-depth integration and utilization of ancient book resources and delivers diversified knowledge services such as ancient book knowledge Q&A, TCM consultation, and health care support.
ancient books on traditional Chinese medicine; digitization; large language model; knowledge service; AIGC
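
The abstract names DPO optimization as the final training stage without stating the objective. As a rough illustration only, the sketch below computes the standard Direct Preference Optimization loss for a single preference pair from summed token log-probabilities. The function name `dpo_loss`, its parameter names, the default `beta=0.1`, and the example log-probabilities are assumptions made for this sketch; they are not taken from the article, and the authors' actual training code may differ.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    All names and the default beta are illustrative assumptions.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    # Scaled preference logit: positive when the policy has shifted
    # toward the chosen response relative to the reference.
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(logits)), computed as softplus(-logits) in a stable form.
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Hypothetical log-probabilities, for illustration only.
unchanged = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy equals reference
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)    # policy prefers the chosen answer
print(unchanged, improved)
```

When the policy is identical to the reference model, the loss equals log 2; it falls as the policy assigns relatively more probability to the preferred response than the reference does, which is the behavior the DPO stage is meant to encourage.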