Machine Reading Comprehension Model Based on MacBERT and Adversarial Training
Machine reading comprehension aims to enable machines to understand natural language text as humans do and to answer questions accordingly. In recent years, owing to advances in deep learning and the availability of large-scale datasets, machine reading comprehension has received widespread attention. However, input questions in practical applications typically contain various kinds of noise and interference, which affect the prediction results of a model. To improve the generalizability and robustness of a model, a machine reading comprehension model based on MLM as correction Bidirectional Encoder Representations from Transformers (MacBERT) and Adversarial Training (AT) is proposed. First, MacBERT converts the input questions and passages into word embeddings and vector representations. Subsequently, a small perturbation is added to the original word vectors, based on the gradients obtained by backpropagation on the original samples, to generate adversarial samples (see the sketch below). Finally, the original and adversarial samples are fed into a Bidirectional Long Short-Term Memory (BiLSTM) network to further extract contextual features of the text and output the predicted answer. Experimental results show that the F1 and Exact Match (EM) scores of the proposed model on the simplified Chinese dataset CMRC2018 improve by 1.39 and 3.85 percentage points, respectively, over the baseline model. On the traditional Chinese dataset DRCD, the F1 and EM scores improve by 1.22 and 1.71 percentage points, respectively, and on the English dataset SQuADv1.1 they improve by 2.86 and 1.85 percentage points, respectively. These results surpass those of most existing machine reading comprehension models. Judging from actual question-answering results, the proposed model outperforms the baseline model in terms of robustness and generalizability, and it performs better when the input questions contain noise.
Keywords: machine reading comprehension; Adversarial Training (AT); pre-trained model; Masked Language Modeling (MLM) as correction Bidirectional Encoder Representations from Transformers (MacBERT)
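The gradient-based perturbation described in the abstract corresponds to the Fast Gradient Method (FGM) commonly used for adversarial training on embedding layers: after backpropagating the loss on the clean sample, the embedding weights are pushed a small step along the normalized gradient direction, a second loss is computed on the perturbed embeddings, and the perturbation is then removed before the optimizer step. The following is a minimal PyTorch sketch of that step under this assumption, not the paper's exact implementation; the embedding parameter name `word_embeddings` and the step size `epsilon` are illustrative choices.

```python
import torch

class FGM:
    """FGM-style adversarial perturbation of the embedding matrix
    (a sketch; parameter names and epsilon are assumptions)."""

    def __init__(self, model, emb_name="word_embeddings", epsilon=1.0):
        self.model = model
        self.emb_name = emb_name  # substring identifying embedding weights
        self.epsilon = epsilon    # perturbation step size
        self.backup = {}

    def attack(self):
        # Called after loss.backward() on the clean batch: move each
        # embedding parameter by r = epsilon * g / ||g||.
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.emb_name in name:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm != 0 and not torch.isnan(norm):
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        # Undo the perturbation before the optimizer updates the weights.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

# Typical training step (sketch):
#   loss = model(batch).loss
#   loss.backward()            # gradients on the clean sample
#   fgm.attack()               # perturb embeddings along the gradient
#   adv_loss = model(batch).loss
#   adv_loss.backward()        # accumulate gradients on the adversarial sample
#   fgm.restore()              # remove the perturbation
#   optimizer.step(); optimizer.zero_grad()
```

Training on both the clean and the perturbed batch in this way is what gives the model its robustness to noisy input questions: the loss surface is flattened in the gradient direction around each training example.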