Computer Science (计算机科学), 2024, Vol. 51, Issue 12: 30-36. DOI: 10.11896/jsjkx.240300025

Deep Contrastive Siamese Network Based Repeated Event Identification

(Original title: 基于深度对比孪生网络的事件辨重方法)

LI Zichen¹, YI Xiuwen², CHEN Shun³, ZHANG Junbo³, LI Tianrui¹

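Author information

  • 1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756
  • 2. Beijing JD Intelligent City Big Data Research Institute, Beijing 100176; JD City (Beijing) Digital Technology Co., Ltd., Beijing 100176
  • 3. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756; Beijing JD Intelligent City Big Data Research Institute, Beijing 100176; JD City (Beijing) Digital Technology Co., Ltd., Beijing 100176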

Abstract

In China, citizens can report problems they encounter in daily life to the government and seek assistance by calling the 12345 citizen hotline. However, many events are reported multiple times, which places significant pressure on the staff responsible for event dispatch, lowers the efficiency of event handling, and wastes public resources. Identifying repeated events requires precise analysis of textual semantics and contextual relationships. To address this problem, this paper proposes a repeated event identification method based on a deep contrastive siamese network. By evaluating the similarity between the description texts of two events, the method identifies events that share the same demand. First, it reduces the number of candidate events through recall and filtering. Then, it constructs a contrastive learning task to fine-tune a pre-trained BERT model, learning semantic representations of event descriptions that are easy to tell apart. Finally, it introduces the event title as contextual information and uses a siamese network with a classifier to identify repeated events. Experiments on the Nantong 12345 event dataset show that the proposed method outperforms the baseline methods on every evaluation metric, particularly on the F0.5 score, which is relevant to the repeated event identification scenario. The method can effectively identify repeated events and improve the efficiency of event handling.
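The abstract highlights the F0.5 score without defining it. As a minimal sketch, assuming it refers to the standard F-beta measure with beta = 0.5 (which weights precision more heavily than recall), it can be computed from pair-level counts as follows. The function name `f_beta` and the example counts are hypothetical and are not taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Standard F-beta score from true-positive, false-positive and false-negative counts.

    With beta < 1, precision contributes more to the score than recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: two classifiers with mirrored precision and recall.
precise = f_beta(tp=8, fp=2, fn=8)   # precision 0.8, recall 0.5
greedy = f_beta(tp=8, fp=8, fn=2)    # precision 0.5, recall 0.8
print(precise > greedy)              # the high-precision classifier scores higher
```

Under this definition, a classifier that rarely flags a non-repeated pair as repeated is rewarded over one that catches more repeats at the cost of more false alarms.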

Key words

12345 hotline / Repeated event identification / Contrastive learning / Siamese network / Urban computing

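Publication year

2024
Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)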

Computer Science (计算机科学)

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X