TextCNN-Based Event Classification of Appeal Texts for the Postal Express Delivery Industry
Ning Yibo (宁艺博) 1, Chen Jingxia (陈景霞) 1, Zhang Pengwei (张鹏伟) 1, Wang Meijia (王梅嘉) 1
Author information
- 1. Shaanxi University of Science and Technology, Xi'an, Shaanxi 710021, China
Abstract
Postal security regulators spend considerable time and labor classifying and summarizing the causes of large numbers of appeal incidents. To address this inefficiency, a method combining Word2vec and TextCNN is proposed to classify the reasons for appeals in express delivery appeal texts automatically. First, the self-collected appeal texts are preprocessed, and the appeal reasons are divided into five types: delay, delivery, loss or shortage, damage, and other. Word2vec is then used to convert the texts into word vectors, and a TextCNN model is constructed and trained to obtain a classification model for appeal texts. Experimental results on real data show that the method classifies appeal texts effectively, with an accuracy of 94.05%, a recall of 93.03%, and an F1 value of 0.9325.
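The classification step described in the abstract can be illustrated with a minimal sketch of a TextCNN forward pass: embed the tokens, apply convolution filters of several widths, take the maximum over time for each filter, and map the pooled features to the five appeal categories. This is not the authors' implementation. The class name `TinyTextCNN`, the kernel sizes, the filter count, the embedding size, and the English placeholder tokens are all assumptions made for illustration; the embeddings are random stand-ins for trained Word2vec vectors, bias terms are omitted, and no training is shown.

```python
import math
import random

# The five appeal categories named in the abstract.
LABELS = ["delay", "delivery", "loss or shortage", "damage", "other"]


def make_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyTextCNN:
    """Hypothetical, untrained TextCNN forward pass in pure Python."""

    def __init__(self, vocab, embed_dim=8, kernel_sizes=(2, 3, 4),
                 num_filters=4, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        self.embed_dim = embed_dim
        self.kernel_sizes = kernel_sizes
        # Stand-in for pretrained Word2vec vectors (one row per token).
        self.embeddings = make_matrix(len(vocab), embed_dim, rng)
        # One filter bank per kernel size; each filter spans k token vectors.
        self.filters = {k: make_matrix(num_filters, k * embed_dim, rng)
                        for k in kernel_sizes}
        self.out_weights = make_matrix(len(LABELS),
                                       num_filters * len(kernel_sizes), rng)

    def embed(self, tokens):
        # Unknown tokens are skipped; short inputs are zero-padded so that
        # the widest filter has at least one window to slide over.
        vecs = [self.embeddings[self.vocab[t]] for t in tokens if t in self.vocab]
        while len(vecs) < max(self.kernel_sizes):
            vecs.append([0.0] * self.embed_dim)
        return vecs

    def features(self, tokens):
        vecs = self.embed(tokens)
        pooled = []
        for k in self.kernel_sizes:
            # Each window is k consecutive token vectors, concatenated.
            windows = [sum(vecs[i:i + k], []) for i in range(len(vecs) - k + 1)]
            for filt in self.filters[k]:
                # Convolution followed by ReLU, then max-over-time pooling.
                acts = [max(0.0, sum(w * x for w, x in zip(filt, win)))
                        for win in windows]
                pooled.append(max(acts))
        return pooled

    def predict_proba(self, tokens):
        feats = self.features(tokens)
        logits = [sum(w * x for w, x in zip(row, feats))
                  for row in self.out_weights]
        peak = max(logits)  # subtract the maximum for a stable softmax
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(self, tokens):
        probs = self.predict_proba(tokens)
        return LABELS[probs.index(max(probs))]
```

Calling `TinyTextCNN(vocab).predict(tokens)` returns one of the five labels, but with random weights the prediction carries no meaning. In practice the input would be segmented Chinese appeal text, the embedding table would be loaded from a trained Word2vec model, and the filters and output layer would be fitted on labeled appeals.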
Key words
appeal events in the express delivery industry/text classification/Word2vec/TextCNN
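Funding
National Natural Science Foundation of China (61806118)
Shaanxi University of Science and Technology Research Start-up Fund (2020BJ-30)
Scientific Research Program of the Shaanxi Provincial Department of Education (22JK0303)
Publication year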
2023