In natural language processing, attribute extraction techniques face problems such as low accuracy and the difficulty of obtaining large-scale training data. To address these problems, this study proposes a method for attribute extraction from Chinese herbal medicines based on the BERT-CRF framework. The approach recasts attribute extraction as a sequence labeling task, combining the rich semantic information provided by the pre-trained language model BERT with the ability of a CRF to model contextual features, in order to improve extraction precision. The study also constructs a dataset for Chinese herbal medicine attribute extraction from book and web data, and applies the BERT-CRF method both to public datasets such as MSRA and to the constructed Chinese herbal medicine attribute dataset. The results show that the proposed model outperforms other sequence labeling models in precision, recall, and F1 score, thereby validating its effectiveness for attribute extraction from Chinese herbal medicines.
Key words
Natural language processing/Attribute extraction/Pre-trained language model/Conditional random field
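The recasting of attribute extraction as sequence labeling, mentioned in the abstract, can be illustrated with a minimal sketch. It assumes a character-level BIO tagging scheme, which is common for Chinese text but is not confirmed by the abstract; the label names `HERB` and `FLAVOR`, the helper functions `to_bio` and `from_bio`, and the example sentence are hypothetical and are not taken from the paper's dataset. The sketch covers only the conversion between annotated spans and tag sequences, not the BERT-CRF model itself.

```python
from typing import List, Tuple

# Hypothetical helpers: illustrate a BIO scheme, not the paper's actual pipeline.

def to_bio(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Turn character spans (start, end exclusive, label) into one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first character of the attribute value
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters of the same value
    return tags


def from_bio(text: str, tags: List[str]) -> List[Tuple[str, str]]:
    """Recover (label, value) pairs from a BIO tag sequence, e.g. a model's prediction."""
    results: List[Tuple[str, str]] = []
    label, chars = None, []
    for ch, tag in zip(text, tags):
        if tag.startswith("B-"):
            if label is not None:
                results.append((label, "".join(chars)))
            label, chars = tag[2:], [ch]
        elif tag.startswith("I-") and label == tag[2:]:
            chars.append(ch)
        else:
            # "O" or an inconsistent I- tag closes any open span
            if label is not None:
                results.append((label, "".join(chars)))
            label, chars = None, []
    if label is not None:
        results.append((label, "".join(chars)))
    return results


# Illustrative sentence ("ginseng tastes sweet"); labels are assumptions.
sentence = "人参味甘"
annotated = [(0, 2, "HERB"), (3, 4, "FLAVOR")]
bio_tags = to_bio(sentence, annotated)
extracted = from_bio(sentence, bio_tags)
```

In a setup like the one the abstract describes, a tagger would predict the tag sequence for each sentence, and a decoding step similar to `from_bio` would turn the predicted tags back into attribute values.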