Changjiang Information & Communications (长江信息通信), 2024, Vol. 37, Issue 1: 55-60. DOI: 10.20153/j.issn.2096-9759.2024.01.016

A Small File Merging Scheme Based on an Improved SVD++ Algorithm and the K-means++ Algorithm

Zhang Guanglong (张广龙) 1, Yin Tieyuan (尹铁源) 1

Author information

  • 1. School of Information Science and Engineering, Shenyang University of Technology, Shenyang, Liaoning 110020

Abstract

This paper proposes a small file merging scheme based on an improved SVD++ algorithm and the K-means++ algorithm. An adaptive learning rate function and a parallel, grouping-based SVD++ algorithm are introduced to optimize the small file merging process and thereby improve the efficiency with which Hadoop stores small files. In addition, the K-means++ algorithm is used to cluster the merged files, which optimizes the data storage layout and reduces wasted storage space. Experiments on the Hadoop platform show that the scheme significantly improves Hadoop's performance in storing and processing small files while maintaining the accuracy and stability of data processing.
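
The abstract does not describe how the clustering step is configured. Purely as a generic illustration, and not as the authors' implementation, the sketch below shows standard K-means++ seeding followed by Lloyd iterations in plain Python. The per-file feature vectors (size in KB, hour of last access), the function names, and the parameter values are all hypothetical assumptions made for this example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng):
    """Pick up to k initial centers using K-means++ (D^2-weighted) seeding."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center.
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            break  # fewer distinct points than k; stop early
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, iters=20, seed=0):
    """Cluster points with K-means++ seeding followed by Lloyd iterations."""
    rng = random.Random(seed)
    centers = kmeanspp_seeds(points, k, rng)
    labels = None
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical features for six merged files: (size in KB, hour of last access).
files = [(4.0, 1.0), (5.0, 1.2), (6.0, 0.9), (90.0, 8.0), (95.0, 8.5), (100.0, 7.9)]
labels, centers = kmeans(files, k=2)
```

Here `labels[i]` is the cluster index assigned to `files[i]`. In a real deployment the features would have to be chosen to reflect whatever notion of file similarity the merging scheme relies on, which the abstract does not specify.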

Key words

Hadoop / Small file merging / SVD++ algorithm / K-means++ algorithm

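Publication year

2024
Changjiang Information & Communications (长江信息通信)
Hubei Communication Services Company (湖北通信服务公司)

Impact factor: 0.338
ISSN: 2096-9759
Number of references: 8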