Deep Learning-Based Density Clustering Simulation of Unstructured Big Data (基于深度学习的非结构化大数据密度聚类仿真)

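Abstract (translated from the Chinese): Conventional density clustering methods for unstructured big data are time-consuming and prone to assigning data densities incorrectly, which degrades clustering accuracy. This paper therefore proposes a fast density clustering method for unstructured big data based on deep learning. A data density function computes the density value of each unstructured data item, and a nearest-neighbor search locates the best center of each cluster. An AlexNet network provides the learning framework for data clustering: feature vectors are extracted through a mapping, and pseudo labels derived from the loss function serve as the basis for backpropagation. To improve clustering speed and accuracy, mini-batch gradient descent is introduced to optimize the model parameters, completing the density clustering of unstructured big data. Experimental results show that the proposed method groups data of similar density tightly and keeps data of very different density far apart, yielding good density clustering results.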
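The density and center-selection steps named in the abstract can be illustrated with a small sketch. The abstract does not give the density function or the search rule, so the Gaussian kernel, the `cutoff` parameter, and the density-times-distance score below are assumptions borrowed from the well-known density-peaks idea, not the authors' formulas. All function names are hypothetical.

```python
import math

def local_density(points, cutoff):
    """Gaussian-kernel density per point (assumed form of the density function)."""
    return [
        sum(math.exp(-(math.dist(p, q) / cutoff) ** 2)
            for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

def pick_centers(points, cutoff, k):
    """Return indices of k points that are dense and far from any denser point."""
    rho = local_density(points, cutoff)
    n = len(points)
    scores = []
    for i in range(n):
        # Neighbor search: distance to the closest point of higher density.
        to_denser = [math.dist(points[i], points[j])
                     for j in range(n) if rho[j] > rho[i]]
        # The densest point has no denser neighbor; use its farthest distance.
        delta = min(to_denser) if to_denser else max(
            math.dist(points[i], q) for q in points)
        scores.append(rho[i] * delta)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]

def assign(points, center_ids):
    """Label each point with the position of its nearest center in center_ids."""
    return [min(range(len(center_ids)),
                key=lambda c: math.dist(p, points[center_ids[c]]))
            for p in points]

# Toy data (illustrative only): two tight groups of five 2-D points each.
group = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
points = group + [(x + 5.0, y + 5.0) for x, y in group]
centers = pick_centers(points, cutoff=0.2, k=2)
labels = assign(points, centers)
print(centers, labels)
```

The pairwise loops cost O(n²) distance evaluations, which is acceptable for a toy example. A large dataset would need an indexed neighbor search instead.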

English title: Deep Learning-Based Density Clustering Simulation of Unstructured Big Data

English abstract: Conventional methods are time-consuming and prone to incorrect data density allocation, which affects data clustering accuracy. Therefore, this paper proposes a fast density clustering method for unstructured big data based on deep learning. First, a data density function is used to calculate the density value of each item of unstructured big data. Second, a proximity search is adopted to find the best center of each cluster. Then, the AlexNet network is used to construct a learning framework for data clustering, and data feature vectors are extracted by mapping. Next, pseudo labels are obtained from the loss function as a basis for backpropagation. To improve the clustering speed and accuracy of the model, mini-batch gradient descent is used to optimize the model parameters, thus achieving density clustering of unstructured big data. Experimental results show that the proposed method brings data of similar density closer together and spreads apart data with large density differences, so it achieves a good density clustering effect.
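The pseudo-label and mini-batch gradient descent steps in the abstract might look roughly like the following sketch. It rests on several assumptions: a linear softmax classifier stands in for the AlexNet feature extractor, pseudo labels come from the nearest cluster center, and the loss is cross-entropy. None of these details are confirmed by the abstract, and the function names, learning rate, and batch size are hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)                      # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def pseudo_labels(features, centers):
    """Label each feature vector with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(f, centers[c]))
            for f in features]

def logits(W, b, x):
    return [sum(w * v for w, v in zip(W[c], x)) + b[c] for c in range(len(b))]

def train_minibatch(features, labels, n_classes,
                    lr=0.5, batch_size=4, epochs=50, seed=0):
    """Fit a linear softmax model to the pseudo labels with mini-batch SGD."""
    rng = random.Random(seed)
    dim = len(features[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)          # new mini-batch composition every epoch
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            gW = [[0.0] * dim for _ in range(n_classes)]
            gb = [0.0] * n_classes
            for i in batch:
                p = softmax(logits(W, b, features[i]))
                for c in range(n_classes):
                    # Cross-entropy gradient: predicted minus pseudo-label one-hot.
                    err = p[c] - (1.0 if labels[i] == c else 0.0)
                    gb[c] += err
                    for d in range(dim):
                        gW[c][d] += err * features[i][d]
            scale = lr / len(batch)  # average the gradient over the batch
            for c in range(n_classes):
                b[c] -= scale * gb[c]
                for d in range(dim):
                    W[c][d] -= scale * gW[c][d]
    return W, b

def predict(W, b, x):
    z = logits(W, b, x)
    return max(range(len(z)), key=lambda c: z[c])

# Toy feature vectors and cluster centers (illustrative values only).
features = [(-1.0, -1.1), (-0.9, -1.0), (-1.1, -0.9), (-1.0, -1.0),
            (1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0)]
centers = [(-1.0, -1.0), (1.0, 1.0)]
labels = pseudo_labels(features, centers)
W, b = train_minibatch(features, labels, n_classes=2)
print([predict(W, b, x) for x in features])
```

Updating on small batches rather than the whole dataset keeps each step cheap, which is the usual reason mini-batch gradient descent is chosen when speed matters.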

English keywords:

Deep learning; Non-structural big data; Data density; Pseudo label

Authors:

Hu Tao (胡涛), Wang Zhongjie (王中杰), Zhang Lianming (张连明), Chen Xiaosuo (陈晓锁)

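Author affiliations:

School of Electrical and Information Engineering, Hunan Institute of Traffic Engineering, Hengyang, Hunan 421001

College of Information Science and Engineering, Hunan Normal University, Changsha, Hunan 410081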

Keywords:

Deep learning; unstructured big data; data density; pseudo label

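Funding:

Teaching Reform Research Project of the Hunan Provincial Department of Education; Key Scientific Research Project of the Hunan Provincial Department of Education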

Project numbers:

HNJG-2021-1275; 22A0056

Publication year:

2024

Computer Simulation (计算机仿真)

The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD

Impact factor: 0.518

ISSN: 1006-9348

Year, volume (issue): 2024, 41(5)

Number of references: 13