Named Entity Recognition for Clothing Quality Sampling Notices Based on BERT-TENER

Recognizing entity information in clothing quality sampling notices is important for assessing clothing quality across regions and for formulating macro-level policy. To address the loss of information in long text sequences and the incomplete feature learning for small-sample entity classes in named entity recognition on quality sampling notices, a domain-specific named entity recognition model centered on the attention mechanism is proposed, built on BERT (bidirectional encoder representations from transformers) and TENER (transformer encoder for NER). The BERT-TENER model first obtains dynamic character vectors from the pre-trained BERT model. These vectors are fed into the TENER module, where the attention mechanism allows the same character to undergo different learning processes, and an improved Transformer further captures the distance and direction information between characters, strengthening the model's understanding of texts of different lengths and of small entity categories. A conditional random field (CRF) model then assigns an entity label to each character. On the domain dataset, the BERT-TENER model reaches an F1 of 92.45% for entity recognition in the clothing sampling domain, improving the recognition rate over traditional methods, and it also performs well on long texts and on imbalanced entity categories.
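The abstract states that the TENER module captures both distance and direction between characters, but does not give the formula. The following is a minimal sketch of one way this can work, assuming the sinusoidal relative position encoding commonly associated with TENER, in which a signed offset t - j is encoded directly. The function names, the dimension of 8, and the base of 10000 are illustrative assumptions and are not taken from this paper.

```python
import math


def relative_position_encoding(offset, dim=8, base=10000.0):
    """Encode a signed offset (t - j) as interleaved sin/cos values.

    Hypothetical helper for illustration; dim and base are assumed values.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    enc = []
    for i in range(dim // 2):
        freq = 1.0 / (base ** (2 * i / dim))
        enc.append(math.sin(offset * freq))  # odd in offset: sign carries direction
        enc.append(math.cos(offset * freq))  # even in offset: carries distance only
    return enc


def absolute_position_encoding(pos, dim=8, base=10000.0):
    """Same sinusoid applied to an absolute index, as in a vanilla Transformer."""
    return relative_position_encoding(pos, dim, base)


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # A character 3 positions to the left vs. 3 positions to the right.
    left = relative_position_encoding(-3)
    right = relative_position_encoding(3)
    print("sin components differ in sign:", left[0], right[0])
    print("cos components are equal:     ", left[1], right[1])

    # With absolute encodings, the dot product depends only on |offset|,
    # so a neighbour at +3 and one at -3 score the same.
    here = absolute_position_encoding(10)
    print(dot(here, absolute_position_encoding(13)))
    print(dot(here, absolute_position_encoding(7)))
```

In this sketch the sine terms flip sign when the offset is negated while the cosine terms do not, so the encoding distinguishes a left neighbour from a right neighbour at the same distance. The dot product of two absolute sinusoidal encodings, by contrast, reduces to a sum of cosines of the offset and is therefore blind to direction.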

Keywords: named entity recognition; clothing quality sampling notice; BERT (bidirectional encoder representations from transformers); TENER (transformer encoder for NER)

Chen Jindong (陈进东), Hu Chao (胡超), Hao Lingxiao (郝凌霄), Cao Lina (曹丽娜)

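School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192

Beijing International Science and Technology Cooperation Base for Intelligent Decision and Big Data Application, Beijing 100192

Computer School, Beijing Information Science and Technology University, Beijing 100192

School of Economics and Management, North China Institute of Science and Technology, Langfang 065201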

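2024

Science Technology and Engineering (科学技术与工程)
Chinese Society of Technology Economics (中国技术经济学会)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.338
ISSN: 1671-1815
Year, volume (issue): 2024, 24(34)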