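Research and Application of Data Governance System Based on Natural Language Processing Technology (基于自然语言处理技术的数据治理体系研究及应用)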

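Abstract (translated from the Chinese): In natural language processing, Chinese word segmentation models suffer from long computation times and limited learning capacity, a problem that continues to trouble the research community. To address this, a combined SACNN+CRF model is proposed. The model brings together the strengths of the self-attention mechanism, convolutional neural networks, and conditional random fields (CRF) to perform Chinese word segmentation. Parameter tests show that the optimal number of hidden units and the optimal number of iterations for the SACNN+CRF model are 100 and 200, respectively. Compared with the BiLSTM+CRF model, the SACNN+CRF model improves the MAE, RMSE, and MAPE metrics by 32.98%, 41.89%, and 36.58%, respectively. The proposed SACNN+CRF model runs efficiently and is of considerable value for Chinese word segmentation tasks.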
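The abstract reports gains in MAE, RMSE, and MAPE but does not say how the percentages were derived. The following is a minimal sketch, assuming the standard definitions of the three error metrics and assuming that "improved by" means the relative reduction in error against the baseline. The function names and sample values are illustrative and are not taken from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Assumed reading of 'improved by X%': relative drop in error vs. the baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Illustrative values only, not data from the paper.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.0, 3.0, 2.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
print(relative_improvement(2.0, 1.5))
```

Under this reading, a baseline error of 2.0 and a model error of 1.5 would count as a 25% improvement. Whether the paper computes its percentages this way is not stated in the abstract.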

English title: Research and Application of Data Governance System Based on Natural Language Processing Technology

English abstract: In natural language processing, the long computing time and limited learning ability of Chinese word segmentation models are problems that perplex the academic community. A combined SACNN+CRF model is proposed. It combines the advantages of the self-attention mechanism, the convolutional neural network, and CRF to complete the Chinese word segmentation task. The parameter test results show that the best number of hidden units and the best number of iterations for the SACNN+CRF model are 100 and 200, respectively. Compared with the BiLSTM+CRF model, the SACNN+CRF model improves MAE, RMSE, and MAPE by 32.98%, 41.89%, and 36.58%, respectively. The proposed SACNN+CRF model is highly efficient and of high value in Chinese word segmentation tasks.

English keywords: Chinese word segmentation; self-attention; convolutional neural network

Authors: Kong Qingbo (孔庆波), Li Wenke (李文科)

Affiliation: Information Center, Guizhou Power Grid Co., Ltd. (贵州电网有限责任公司信息中心), Guiyang, Guizhou 550002

Keywords: Chinese word segmentation (中文分词); self-attention (自注意力); convolutional neural network (卷积神经网络)

Funding: Project No. 066700KK52170030

Publication year: 2024

Microcomputer Applications (微型电脑应用)

Shanghai Society of Microcomputer Applications (上海市微型电脑应用学会)

CSTPCD

Impact factor: 0.359

ISSN: 1007-757X

Year, volume (issue): 2024, 40(2)

Number of references: 10