
Research on the Classification of Bad Credit Information in Construction Market Based on Natural Language Processing

To reduce the reliance on manual labor for managing credit information documents in construction credit management, this paper proposes a text classification method, based on natural language processing (NLP), for information on the bad credit behavior of construction enterprises. First, word vector representations of the text are obtained with a Skip-Gram word vector model trained on the labeled data together with a large amount of unlabeled data. Second, a bidirectional long short-term memory network (BiLSTM) incorporating an attention mechanism performs feature extraction and text classification on the labeled data. The results show that, in small-sample training, training the word vector model on a larger corpus effectively improves the classification performance of the text classification model; that the BiLSTM-Attention model outperforms the comparison models; and that the NLP-based method can classify the bad credit information of construction enterprises quickly and automatically.
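The two building blocks named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names `skipgram_pairs` and `attention_pool`, the window size, and the use of dot-product scoring against a single query vector are all assumptions made for illustration, and the abstract does not specify which attention scoring function the paper uses.

```python
import math


def skipgram_pairs(tokens, window=2):
    """Build (center, context) training pairs of the kind Skip-Gram learns from.

    `window` is an assumed hyperparameter: the number of neighbours taken
    on each side of the center word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def attention_pool(hidden_states, query):
    """Collapse per-token vectors into one sentence vector.

    `hidden_states` stands in for BiLSTM outputs (one vector per token).
    Scores are dot products with `query` (an assumed scoring function),
    normalised with a numerically stable softmax.
    """
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in hidden_states]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


if __name__ == "__main__":
    # Hypothetical tokenised penalty record, for illustration only.
    tokens = ["contractor", "fined", "for", "unlicensed", "construction"]
    print(skipgram_pairs(tokens, window=1))

    # Toy 2-dimensional "hidden states" for three tokens.
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, sentence_vector = attention_pool(states, query=[1.0, 1.0])
    print(weights, sentence_vector)
```

In a full pipeline the pooled sentence vector would feed a softmax classifier over the bad-credit categories, and the query vector would be learned jointly with the BiLSTM rather than fixed by hand.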

bad credit information; administrative penalty; Skip-Gram word vector; attention mechanism; text classification

张振森、任宇轩、曹吉昌


School of Management Engineering, Qingdao University of Technology, Qingdao, Shandong 266525

University of Chinese Academy of Sciences, Beijing 100049


2024

Journal of Jiujiang University (Natural Science Edition)
Jiujiang University


Impact factor: 0.304
ISSN:
Year, volume (issue): 2024, 39(3)