In Tibetan medicine research, accurately extracting medical entities and their relationships from medical texts and structuring them into triples is essential for constructing knowledge graphs. However, existing methods, which mainly rely on general pre-trained models to process Tibetan medicine texts, often overlook specialized terminology, which limits their generalization and robustness. This paper proposes a model based on the encoder-decoder architecture, enhanced with a pointer mechanism, to overcome these shortcomings. In the encoding phase, the model uses BERT and GloVe to generate rich embeddings, improving its understanding of medical terms. In the decoding phase, a Transformer decoder is combined with a pointer mechanism to produce structured entity and relationship information directly. The training process incorporates the concept of similar spans to refine the model's entity recognition capabilities. Experiments on the CMeIE-V2 and TibetanAI_TMDisRE_v1.0 datasets show that the model outperforms advanced baselines in both performance and robustness.
Key words
encoder-decoder architecture/pointer mechanism/Tibetan medicine texts/joint entity and relation extraction
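
As a rough illustration of the pointer mechanism mentioned in the abstract, the sketch below shows how a decoder state can "point" at a source position by scoring it against each encoder state, and how four pointer steps (head start, head end, tail start, tail end) can yield one triple. This is a minimal sketch under stated assumptions, not the paper's implementation: the dot-product scoring, the four-step layout, the function names `point` and `decode_triple`, and the toy vectors and tokens are all hypothetical, and the relation label is passed in directly rather than predicted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def point(decoder_state, encoder_states):
    """Score every source position against the decoder state (dot product,
    an assumption) and return the most likely position plus the distribution."""
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


def decode_triple(tokens, encoder_states, step_states, relation):
    """Run four pointer steps (head start, head end, tail start, tail end)
    and assemble a (head, relation, tail) triple from the pointed spans."""
    hs, he, ts, te = (point(s, encoder_states)[0] for s in step_states)
    head = " ".join(tokens[hs:he + 1])
    tail = " ".join(tokens[ts:te + 1])
    return (head, relation, tail)


# Toy input: placeholder tokens and one-hot "encoder states" (hypothetical).
tokens = ["herb", "A", "treats", "fever"]
encoder_states = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Hypothetical decoder states, one per pointer step.
step_states = [[5, 0, 0, 0], [0, 5, 0, 0], [0, 0, 0, 5], [0, 0, 0, 5]]

triple = decode_triple(tokens, encoder_states, step_states, "treats")
print(triple)
```

In a trained model the encoder states would come from the embedding layer and the decoder states from the Transformer decoder; here they are hand-picked so that each step points at an obvious position.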