Content Quality Detection Model for Web Articles

The Internet contains a large number of articles of uneven quality, which seriously damages the online ecosystem. Detecting the quality of online articles is therefore an important and new task for building a green cyberspace. Using the Tencent dataset, we study article quality detection along three dimensions: organization features, writing features, and semantic features. We build three sub-networks (an organization sub-network, a feature sub-network, and a text sub-network) and extend three attention modes and four Transformer modes. With CNN+BiGRU, Attention+ACNN, and Transformer model I, the three sub-networks reach classification accuracies of 80.6%, 87%, and 92.9%, respectively, and the OFT model framework, which combines the three sub-networks, reaches 93.3%. In addition, BERT word vectors for the text data are obtained in two ways, with which the final accuracy of OFT reaches 94.2%. Experimental results show that the proposed model outperforms existing models.

Keywords: content quality detection; four Transformer modes; three attention modes; OFT model framework

Wang Kainan (王凯楠), Lin Xinxin (林欣欣), Wang Wei (王薇)

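School of Cyberspace Security, Changchun University, Changchun 130022, Jilin, China

School of Computer Science and Technology, Changchun University, Changchun 130022, Jilin, China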

2024

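Computer Applications and Software
Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology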

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.615
ISSN: 1000-386X
Year, volume (issue): 2024, 41(12)