Named entity recognition for machining processes based on subordinate (modifier-head) structure representation
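Wang Suqin (王素琴) 1, Wang Yujue (王钰珏) 1, Shi Min (石敏) 1, Zhu Dengming (朱登明) 2, Li Zhaoxin (李兆歆) 2
Author information
- 1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206
- 2. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081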
Abstract
Manufacturing enterprises have accumulated a large body of part-machining experience, most of it in text form, and extracting high-quality machining knowledge from this text remains an open problem. The entities to be recognized often have a subordinate (modifier-head) structure, which blurs entity boundaries. To address this, a multi-network coordinated method for Chinese named entity recognition is proposed. When BERT generates character vectors, a domain adaptation method improves how well those vectors represent process entities. In addition, an attention mechanism and a multi-gate mixture-of-experts network are introduced into the BiLSTM-CRF model to capture contextual features and entity information. Experiments show that the proposed method reaches an F1 score of 80.15% on entity recognition for mechanical part machining, outperforming the current mainstream named entity recognition models it was compared with.
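The abstract names a multi-gate mixture-of-experts network but gives no architectural details. As a rough illustration only, the sketch below shows the generic multi-gate mixture-of-experts idea in plain Python: several expert layers process the same input, and each gate produces its own softmax weighting over the experts. The class name `MultiGateMoE`, the dimensions, the random weights, and the choice of two gates (for instance one for contextual features and one for entity information) are all assumptions made for illustration, not the authors' configuration. In a real tagger the input would presumably be a per-character hidden state from the BiLSTM rather than a hand-written vector.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Matrix-vector product: weights is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultiGateMoE:
    """Hypothetical multi-gate mixture-of-experts layer (illustrative only)."""

    def __init__(self, in_dim, out_dim, num_experts, num_gates, seed=0):
        rng = random.Random(seed)
        # Each expert is a single linear map followed by ReLU.
        self.experts = [random_matrix(out_dim, in_dim, rng)
                        for _ in range(num_experts)]
        # Each gate scores every expert from the same input.
        self.gates = [random_matrix(num_experts, in_dim, rng)
                      for _ in range(num_gates)]

    def forward(self, x):
        expert_outs = [[max(0.0, v) for v in linear(w, x)]
                       for w in self.experts]
        out_dim = len(expert_outs[0])
        outputs, mixes = [], []
        for gate in self.gates:
            mix = softmax(linear(gate, x))  # one weight per expert
            mixed = [sum(p * out[j] for p, out in zip(mix, expert_outs))
                     for j in range(out_dim)]
            outputs.append(mixed)
            mixes.append(mix)
        return outputs, mixes


# Toy input standing in for one character's hidden state.
layer = MultiGateMoE(in_dim=4, out_dim=3, num_experts=3, num_gates=2)
outputs, mixes = layer.forward([0.2, -0.1, 0.4, 0.3])
```

Each gate returns its own mixture of the shared experts, so different downstream objectives can weight the same experts differently. How the paper combines these outputs with the attention mechanism and the CRF layer is not stated in the abstract.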
Key words
Chinese named entity recognition/mechanical part machining/multi-gate mixture-of-experts network/domain adaptation
Funding
National Key Research and Development Program of China (2020YFB1710400)
Publication year
2024