Research on named entity recognition of traditional Chinese medicine based on feature enhancement
Traditional Chinese medicine (TCM) herbal literature contains rich TCM knowledge and is an important carrier for theoretical research in TCM. To better mine this knowledge and to recognize named entities in TCM herbal literature accurately, a feature-enhanced BERT-BiGRU-CRF named entity recognition model for TCM herbal literature is proposed. A feature fusion module concatenates the word vectors generated by BERT with entity features to form the input, a bidirectional gated recurrent unit (BiGRU) serves as the feature extractor, and a conditional random field (CRF) performs label prediction. Feature enhancement helps the model better identify entities such as herb name, property, taste, and meridian tropism, together with their boundaries, completing the named entity recognition task for TCM herbs. Experiments on a TCM herbal dataset show that the model with fused features reaches an F1 score of 90.54%, demonstrating that the proposed method improves the accuracy of named entity recognition for TCM herbs.
Keywords: named entity recognition; Chinese herbal medicine; feature enhancement; dictionary information
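The feature fusion step described in the abstract can be illustrated with a small sketch. The abstract does not say how the entity features are built, so this example assumes they come from dictionary matching (suggested only by the "dictionary information" keyword): each character receives a BIO-style tag by greedy longest match against a lexicon, the tag is one-hot encoded, and the result is concatenated with that character's embedding. The names `dictionary_tags`, `one_hot`, `fuse`, the tag set, and the toy 3-dimensional vectors standing in for BERT output are all hypothetical.

```python
from typing import Dict, List

# Hypothetical feature tag set: "O" plus B-/I- tags for the four entity
# types named in the abstract (name, property, taste, meridian tropism).
ENTITY_TYPES = ["NAME", "PROPERTY", "TASTE", "MERIDIAN"]
FEATURE_TAGS = ["O"] + [f"{p}-{t}" for t in ENTITY_TYPES for p in ("B", "I")]
TAG_INDEX = {tag: i for i, tag in enumerate(FEATURE_TAGS)}


def dictionary_tags(text: str, lexicon: Dict[str, str]) -> List[str]:
    """Tag each character by greedy longest match against the lexicon.

    Assumption: the lexicon maps a surface string to one entity type.
    """
    tags = ["O"] * len(text)
    max_len = max((len(word) for word in lexicon), default=0)
    i = 0
    while i < len(text):
        matched = False
        # Try the longest candidate first so longer entries win.
        for size in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + size]
            if word in lexicon:
                entity_type = lexicon[word]
                tags[i] = f"B-{entity_type}"
                for j in range(i + 1, i + size):
                    tags[j] = f"I-{entity_type}"
                i += size
                matched = True
                break
        if not matched:
            i += 1
    return tags


def one_hot(tag: str) -> List[float]:
    """Encode a feature tag as a one-hot vector over FEATURE_TAGS."""
    vec = [0.0] * len(FEATURE_TAGS)
    vec[TAG_INDEX[tag]] = 1.0
    return vec


def fuse(token_vectors: List[List[float]], tags: List[str]) -> List[List[float]]:
    """Concatenate each token embedding with its one-hot entity feature."""
    if len(token_vectors) != len(tags):
        raise ValueError("one embedding is required per tagged character")
    return [vec + one_hot(tag) for vec, tag in zip(token_vectors, tags)]


if __name__ == "__main__":
    text = "人参味甘"
    lexicon = {"人参": "NAME", "甘": "TASTE"}  # toy lexicon, not from the paper
    tags = dictionary_tags(text, lexicon)
    # Placeholder embeddings; a real system would use BERT's output here.
    embeddings = [[0.1, 0.2, 0.3] for _ in text]
    fused = fuse(embeddings, tags)
    print(tags)
    print(len(fused[0]))
```

In a full pipeline under these assumptions, the fused vectors would be the per-character input to the BiGRU layer, whose outputs the CRF layer would then decode into entity labels.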