
Multidimensional Feature Named Entity Recognition Method in Education Domain

Advances in information technology have made "Internet + Education" a research focus in the education field, and every stage of teaching is moving toward intelligent systems. Research on Named Entity Recognition (NER) for secondary school mathematics lays the groundwork for building a subject knowledge graph and for automatic question answering, which in turn supports personalized knowledge acquisition for secondary school students and the construction of a new intelligent education system. Secondary school mathematics knowledge is semantically complex, NER research and data for it are scarce, and current mainstream models tend to miss some local features during feature extraction. To address these difficulties, a multidimensional feature NER method incorporating multi-head attention is proposed, using a self-constructed corpus of secondary school mathematics knowledge. The method first uses pre-trained Bidirectional Encoder Representations from Transformers (BERT) to encode the text and obtain word vectors. It then introduces adversarial training to perturb each embedding vector, and passes both the resulting adversarial samples and the original embedding vectors to a multidimensional feature extraction layer. The extracted features are concatenated, dynamically fused by a multi-head attention mechanism, and finally corrected by a Conditional Random Field (CRF) before output. Experimental results show that on the self-constructed Educ dataset the method achieves a precision, recall, and F1 score of 96.68%, 97.71%, and 97.19%, respectively, demonstrating its effectiveness for recognizing secondary school mathematics knowledge entities.
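The three reported scores can be cross-checked against each other. The sketch below assumes that F1 is the usual harmonic mean of precision and recall, and that the first reported figure (96.68%) is precision in that sense; the abstract does not state the formula, and the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, in percent.
# Assumption: 96.68 is precision, 97.71 is recall.
precision, recall = 96.68, 97.71
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 97.19%"
```

Under these assumptions the computed value agrees with the reported F1 of 97.19%, so the three figures are mutually consistent.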

Named Entity Recognition (NER); education domain; adversarial training; multidimensional feature extraction; multi-head attention mechanism

Ren Yi (任义), Su Bo (苏博), Yuan Shuai (袁帅)


School of Computer Science and Engineering, Shenyang Jianzhu University, Shenyang 110168, Liaoning, China


National Natural Science Foundation of China (62073227); Liaoning Provincial Department of Education Fund (LJKZ0581, LJKZ0584)

2024

Computer Engineering (计算机工程)
East China Institute of Computing Technology (华东计算技术研究所); Shanghai Computer Society (上海市计算机学会)


CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.581
ISSN:1000-3428
Year, volume (issue): 2024, 50(10)