
Research on Entity Relation Extraction Method of Mechanical Assembly Process Text

Mechanical assembly often requires workers to read and understand large volumes of assembly process text, which is time-consuming. Because the people who write these texts and the people who assemble from them differ in ability, assemblers may misread the instructions, leading to wrongly assembled or missing parts. A mechanical assembly matrix stores the assembly relations between parts in matrix form and expresses them directly and effectively; it is easy for workers to understand and easy for computers to process, and it can significantly improve assembly efficiency. Natural language processing, as a means of making computers understand human language, can play a key role in generating an assembly matrix from assembly text. This paper applies natural language processing to preprocess assembly text through sentence splitting, word segmentation, and part-of-speech tagging, using a corpus of mechanical assembly nouns to improve the accuracy of segmentation and tagging for assembly part names. Subject-predicate-object triplets are then generated for each sentence by two methods, syntactic dependency analysis and syntactic template matching, with the same noun corpus used to identify which elements are assembly part names. Triplets whose subject and object are both assembly parts are extracted as assembly relations and post-processed by removing redundant words and aligning entities. Finally, an empty matrix is created according to the number of parts, the assembly relations are filled into the contact matrix, and the contact-connection matrix of the assembly relations is generated according to part type.
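The last two steps of the abstract (keeping only triplets whose subject and object are both parts, then filling a matrix) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the part lexicon `PART_TYPES`, the alias table `ALIASES`, the symmetric matrix, and the encoding 0 = no relation, 1 = contact, 2 = connection (used whenever one of the two parts is a fastener). The triplets are hand-written stand-ins for the output of the dependency analysis and template matching stages.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical part lexicon: canonical part name -> part type.
PART_TYPES: Dict[str, str] = {
    "shaft": "component",
    "bearing": "component",
    "end cap": "component",
    "bolt": "fastener",
}

# Hypothetical alias table used for entity alignment.
ALIASES: Dict[str, str] = {
    "the shaft": "shaft",
    "ball bearing": "bearing",
    "cover": "end cap",
}


def align(name: str) -> str:
    """Map a surface form to its canonical part name (entity alignment)."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def assembly_relations(triples: List[Triple]) -> List[Tuple[str, str]]:
    """Keep only triplets whose subject and object are both known parts."""
    relations = []
    for subj, _pred, obj in triples:
        a, b = align(subj), align(obj)
        if a in PART_TYPES and b in PART_TYPES and a != b:
            relations.append((a, b))
    return relations


def contact_connection_matrix(
    parts: List[str], relations: List[Tuple[str, str]]
) -> List[List[int]]:
    """Fill an initially empty n x n matrix (assumed encoding: 0/1/2)."""
    index = {p: i for i, p in enumerate(parts)}
    n = len(parts)
    matrix = [[0] * n for _ in range(n)]
    for a, b in relations:
        # Assumed rule: a fastener on either side makes it a connection.
        is_connection = "fastener" in (PART_TYPES[a], PART_TYPES[b])
        code = 2 if is_connection else 1
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = code
    return matrix


triples = [
    ("Ball bearing", "is pressed onto", "the shaft"),
    ("Bolt", "fastens", "cover"),
    ("worker", "checks", "bearing"),  # subject is not a part: dropped
]
parts = list(PART_TYPES)  # shaft, bearing, end cap, bolt
relations = assembly_relations(triples)
matrix = contact_connection_matrix(parts, relations)
for name, row in zip(parts, matrix):
    print(f"{name:8s} {row}")
```

A symmetric matrix is used because contact between two parts does not depend on which one the sentence names as subject; a directed encoding would be equally possible if the source method distinguishes the two roles.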

Keywords: assembly process text; entity relationship; natural language processing; part-of-speech tagging; triplet; assembly relation matrix

Yin Yudong (尹昱东), Wang Baojian (王保建), Li Kejia (李珂嘉), Wang Ziping (王紫平), Liu Jie (刘洁)


School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an 710049, China


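Funding: Natural Science Basic Research Program of Shaanxi Province (2021M-169); Natural Science Basic Research Program of Shaanxi Province (2023-JC-YB-477); Xi'an Jiaotong University Special Project for Teaching Reform Research in Undergraduate Experiment, Practice, and Innovation and Entrepreneurship Education (2022) (22SJZX10)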

2024

Computer Measurement & Control (计算机测量与控制)
China Computer Automatic Measurement and Control Technology Association


CSTPCD
Impact factor: 0.546
ISSN: 1671-4598
Year, volume (issue): 2024, 32(6)