Unstructured Cloud Data Block Storage Method Based on Decision Tree Model

To reduce the storage pressure of unstructured cloud data and improve storage capacity, a block storage method for unstructured cloud data based on a decision tree model is studied. The unstructured cloud data are preprocessed through data cleaning, data selection, data transformation, and normalization, which reduces their dimensionality. A selective randomness feature analysis method is used to determine the correlation between the distribution feature quantity of the correlation dimension and the similarity of the preprocessed data; on this basis, features of the unstructured cloud data are extracted by sample expansion and density fusion. An improved decision tree algorithm performs fuzzy classification on the extracted feature set, and the data of each class are divided into blocks of the same specification. After Vandermonde matrix encoding and decoding, the blocks are stored on multiple nodes with high fitness. Experimental results show that the effective calculation ratio of the method reaches 0.8, indicating good storage capability, and that the mean compression factor reaches 6.7, which significantly reduces the storage pressure of unstructured cloud data.
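The block-splitting and Vandermonde coding step in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the field, the block size, or how nodes are selected, so the prime field GF(257), the evaluation points 1..n, and the function names `split_blocks`, `encode`, and `decode` are all assumptions made for illustration. The sketch splits a byte string into k equal-size blocks, produces n coded blocks (one per storage node), and recovers the data from any k of them.

```python
from __future__ import annotations

P = 257  # assumed prime modulus; every byte value 0..255 is a field element


def split_blocks(data: bytes, k: int) -> list[list[int]]:
    """Split data into k equal-length blocks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [list(padded[i * size:(i + 1) * size]) for i in range(k)]


def encode(blocks: list[list[int]], n: int) -> list[list[int]]:
    """Multiply the k blocks by an n x k Vandermonde matrix over GF(P).

    Row i of the matrix is [1, x, x^2, ...] with x = i + 1 (assumed points).
    """
    k, size = len(blocks), len(blocks[0])
    rows = [[pow(x, j, P) for j in range(k)] for x in range(1, n + 1)]
    return [
        [sum(row[j] * blocks[j][c] for j in range(k)) % P for c in range(size)]
        for row in rows
    ]


def solve(a: list[list[int]], b: list[list[int]]) -> list[list[int]]:
    """Solve a @ x = b over GF(P) by Gauss-Jordan elimination."""
    k = len(a)
    m = [a[i][:] + b[i][:] for i in range(k)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], -1, P)  # modular inverse (Python 3.8+)
        m[col] = [v * inv % P for v in m[col]]
        for r in range(k):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [(u - f * v) % P for u, v in zip(m[r], m[col])]
    return [row[k:] for row in m]


def decode(shares: dict[int, list[int]], k: int, length: int) -> bytes:
    """Recover the original bytes from any k coded blocks.

    shares maps a 0-based node index to the coded block stored on that node.
    """
    idx = sorted(shares)[:k]
    a = [[pow(i + 1, j, P) for j in range(k)] for i in idx]
    blocks = solve(a, [shares[i] for i in idx])
    return bytes(v for block in blocks for v in block)[:length]


if __name__ == "__main__":
    data = b"unstructured cloud data"
    coded = encode(split_blocks(data, 3), 5)        # 3 data blocks -> 5 nodes
    surviving = {0: coded[0], 3: coded[3], 4: coded[4]}  # nodes 1 and 2 lost
    print(decode(surviving, 3, len(data)) == data)
```

Because the evaluation points are distinct modulo 257, any k rows of the Vandermonde matrix form an invertible square matrix, which is why any k surviving nodes are enough to decode. A production design would more likely use GF(2^8) so that coded values also fit in one byte; the prime field is chosen here only to keep the arithmetic readable.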

decision tree model; unstructured; cloud data; block storage; preprocessing; Vandermonde matrix

Wan Lei (万磊)

Shanghai Posts and Telecommunications Design Consulting Institute Co., Ltd. (上海邮电设计咨询研究院有限公司), Beijing 100070

2024

Microcomputer Applications (微型电脑应用)
Shanghai Microcomputer Applications Society (上海市微型电脑应用学会)

CSTPCD
Impact factor: 0.359
ISSN: 1007-757X
Year, volume (issue): 2024, 40(9)