Classification of Multi-Task Questions in the Automatic Question-Answer System for Tourism

The informatization of the tourism industry calls for automatic tourism question-answering systems, in which question classification is a key component. Traditional question category systems take a single perspective, and traditional classification models perform poorly on imbalanced question datasets. To address this, the paper constructs a question category system for the tourism domain from two perspectives, question topic and answer type, and proposes a multi-task question classification model, MT-Bert. The model performs multi-task training on BERT, adds a self-attention mechanism, uses a Softmax classifier, and is trained with a purpose-designed multi-task fusion loss function. Results on a Shanxi tourism dataset show that MT-Bert achieves micro-averaged F1 scores of 97.6% and 91.7% under the two category systems, respectively, and avoids prediction failure on imbalanced data, indicating that it handles imbalanced data effectively.
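The abstract names a multi-task fusion loss and a micro-averaged F1 metric but gives no formula for either. The sketch below is only an illustration under stated assumptions: the fusion is taken to be a convex combination of two Softmax cross-entropy losses, and the weight `alpha` and the function names `fused_loss` and `micro_f1` are hypothetical, not taken from the paper. The micro-averaged F1 follows the standard definition, pooling true positives, false positives, and false negatives over all classes.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, gold):
    """Negative log-probability that softmax assigns to the gold class index."""
    return -math.log(softmax(logits)[gold])


def fused_loss(topic_logits, topic_gold, answer_logits, answer_gold, alpha=0.5):
    """Assumed fusion: weighted sum of the two task losses.

    alpha is a hypothetical weight; the paper's actual fusion may differ.
    """
    topic_loss = cross_entropy(topic_logits, topic_gold)
    answer_loss = cross_entropy(answer_logits, answer_gold)
    return alpha * topic_loss + (1.0 - alpha) * answer_loss


def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP, FP, FN over every class, then compute F1."""
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label:
                fp += 1
            elif g == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

In a single-label setting, micro-averaged F1 equals accuracy, so a model that never predicts a rare class can still score well. This is presumably why the abstract reports separately that prediction failure on imbalanced data is avoided.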

Keywords: tourism question answering (QA); question classification; classification system; BERT; self-attention; multi-task

Chen Qian (陈千), Feng Zizhen (冯子珍), Wang Suge (王素格), Guo Xin (郭鑫)

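School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006

Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan, Shanxi 030006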

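Funding: Shanxi Province Key Research and Development Program (201803D421024); Shanxi Province Applied Basic Research Program (201901D111032, 201701D221101); National Natural Science Foundation of China (61502288, 61403238)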

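Computer Applications and Software (计算机应用与软件)
Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology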

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.615
ISSN: 1000-386X
Year, volume (issue): 2024, 41(1)