A Chinese Named Entity Recognition Method Based on Multi-Feature Information Fusion and a Self-Attention Mechanism
Industrial domain data are unstructured, domain-specific, and scarce, so traditional Chinese Named Entity Recognition (NER) techniques often yield suboptimal results in industrial contexts. In response, this paper introduces a novel self-attention network algorithm, developed on automotive industry data, that combines label semantics with glyph and phonetic features. The model employs a self-attention mechanism to capture long-distance textual dependencies and integrates them with word segmentation information at the character level. Fusing these features with the context of label semantics notably improves character and word boundary recognition. This approach effectively mitigates the common ambiguities in word boundary segmentation and the complexities of contextual dependencies between phrases. Extensive testing on the MSRA and Weibo datasets and on a bespoke industrial maintenance document dataset demonstrates the method's efficacy. The results show a significant improvement in entity recognition accuracy, particularly in industrial scenarios such as the automotive parts datasets.
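To make the fusion idea concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes that each character carries a character embedding, a glyph feature, a phonetic feature, and a word-segmentation tag, that these are fused by simple concatenation, and that self-attention uses identity query/key/value projections. The function names (`fuse`, `self_attention`, `label_attention`), the toy vectors, and the two labels are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def fuse(char_vecs, glyph_vecs, phonetic_vecs, seg_vecs):
    """Concatenate the per-character feature vectors (assumed fusion rule)."""
    return [c + g + p + s
            for c, g, p, s in zip(char_vecs, glyph_vecs, phonetic_vecs, seg_vecs)]

def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Every character attends to every other character, so distant
    positions can influence each other directly.
    """
    d = len(x[0])
    out = []
    for q in x:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in x])
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

def label_attention(h, label_vecs):
    """Score each character representation against each label embedding."""
    return [softmax([dot(row, lab) for lab in label_vecs]) for row in h]

# Toy input: three characters; all values are made up for illustration.
chars = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
glyphs = [[0.5], [0.1], [0.3]]
phonetics = [[0.2], [0.4], [0.6]]
segs = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # hypothetical one-hot B/E tags
labels = [
    [0.1, 0.1, 0.0, 0.0, 0.2, 0.2],  # hypothetical embedding for label "O"
    [0.9, 0.3, 0.4, 0.1, 0.8, 0.0],  # hypothetical embedding for label "B-PART"
]

fused = fuse(chars, glyphs, phonetics, segs)
hidden = self_attention(fused)
label_probs = label_attention(hidden, labels)
```

Here `label_probs[i]` is a distribution over the two toy labels for character `i`. A trained model would learn the projections and the label embeddings rather than fix them by hand, and would typically decode the label sequence jointly instead of per character.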