A Balanced Data Distribution Algorithm Based on Ceph Storage

In the Ceph distributed storage system, the Controlled Replication Under Scalable Hashing (CRUSH) data distribution algorithm can leave the stored data capacity of different devices differing by as much as 40%. Under large data volumes and high concurrency, the resulting "hot spots" become a bottleneck for system performance. To address this problem, this paper studies the CRUSH algorithm in depth and designs and implements the Writing_Balance algorithm to optimize data distribution, with the aim of eliminating the load imbalance and excessive disk utilization caused by hot spots. Experiments show that, compared with a storage system that does not use it, Writing_Balance improves the optimization rate of the placement group (PG) count distribution on hot spots by 4.4% and improves the stability of disk utilization by about 3%. It also noticeably improves overall data balance when the input key space is small.

Keywords: Ceph distributed storage; data distribution balancing; Controlled Replication Under Scalable Hashing (CRUSH); data distribution algorithm

Authors: Miao Yuhao (苗宇豪), Fan Zhonglei (范中磊), Zhang Mozhai (张墨翟), Yang Liu (杨柳)

School of Information Engineering, Chang'an University, Xi'an, Shaanxi 710064, China

Funding: Fundamental Research Funds for the Central Universities (CHD2011TD009)

2024

Microelectronics & Computer
The 771st Research Institute, Ninth Academy of China Aerospace Science and Technology Corporation

CSTPCD
Impact factor: 0.431
ISSN:1000-7180
Year, volume (issue): 2024, 41(3)