
HyperPart: A Hypergraph-Based Abstraction for Deduplicated Storage Systems

Deduplication is widely used to reduce storage overhead by removing redundant data blocks across large-scale servers in data centers. However, it exacerbates the fragmentation of data blocks, causing more cross-server file retrievals and sharply lower retrieval throughput. Some approaches favor retrieval performance by confining all blocks of a file to a single server, at the cost of non-trivial space consumption because more blocks are replicated across servers. An ideal network storage system should account for both deduplication and retrieval performance through a reasonable assignment of the detected unique blocks. Such a fine-grained assignment requires an accurate and comprehensive abstraction of the files, the blocks, and the file-block affiliation relationships. To this end, we design a weighted hypergraph to profile these multivariate data correlations. With this abstraction in place, we propose HyperPart, which transforms the complex block allocation problem into a hypergraph partition problem. For more general scenarios with dynamic file updates, we further propose a two-phase incremental hypergraph repartition scheme, which mitigates performance degradation with minimal migration volume. We implement a prototype system of HyperPart, and the experimental results show that it saves around 50% of the storage space and improves retrieval throughput by approximately 30% over state-of-the-art methods under the balance constraints.
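To make the file-block abstraction concrete, the following is a minimal sketch and not the HyperPart implementation. It assumes unique blocks are hypergraph vertices weighted by size and each file is a hyperedge over its blocks, and it scores a given block-to-server assignment by per-server load and by the number of extra servers each file retrieval must contact. The names `build_hypergraph` and `evaluate`, the toy data, and the choice of cost metric are all illustrative assumptions.

```python
from collections import defaultdict


def build_hypergraph(files):
    """Build a weighted hypergraph from files (hypothetical helper).

    files maps file_id -> list of (block_fingerprint, size_in_bytes).
    Returns (block_size, hyperedges): vertices are unique blocks weighted
    by size, and each file is a hyperedge over the blocks it references.
    """
    block_size = {}
    hyperedges = {}
    for file_id, blocks in files.items():
        for fingerprint, size in blocks:
            # Repeated fingerprints collapse into one vertex (deduplication).
            block_size[fingerprint] = size
        hyperedges[file_id] = {fingerprint for fingerprint, _ in blocks}
    return block_size, hyperedges


def evaluate(block_size, hyperedges, assignment):
    """Score an assignment of blocks to servers (assumed cost model).

    Returns (load, extra_servers): load is the stored bytes per server,
    extra_servers sums, over all files, the number of servers spanned
    by the file minus one.
    """
    load = defaultdict(int)
    for fingerprint, server in assignment.items():
        load[server] += block_size[fingerprint]
    extra_servers = sum(
        len({assignment[fp] for fp in blocks}) - 1
        for blocks in hyperedges.values()
    )
    return dict(load), extra_servers


# Toy data: three files sharing blocks b2 and b3 (all values are made up).
files = {
    "file_a": [("b1", 4), ("b2", 4)],
    "file_b": [("b2", 4), ("b3", 4)],
    "file_c": [("b3", 4), ("b4", 4)],
}
block_size, hyperedges = build_hypergraph(files)

# One candidate placement of the four unique blocks on two servers.
assignment = {"b1": 0, "b2": 0, "b3": 1, "b4": 1}
load, extra_servers = evaluate(block_size, hyperedges, assignment)
```

In this toy case the two servers hold equal loads and only `file_b` spans both of them. A partitioner would search over `assignment` to reduce `extra_servers` while keeping `load` balanced; the search itself is outside this sketch.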

Servers; Fingerprint recognition; Throughput; Metadata; Resource management; Switches; Data centers; Indexing; Ethernet; Correlation

Geyao Cheng, Junxu Xia, Lailong Luo, Haibo Mi, Deke Guo, Richard T. B. Ma


State Key Laboratory of Complex and Critical Software Environment, College of Information and Communication, National University of Defense Technology, Wuhan, China

National Key Laboratory of Information Systems Engineering, National University of Defense Technology, Changsha, China

National Key Laboratory of Information Systems Engineering, National University of Defense Technology, Changsha, China; National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

School of Computing, National University of Singapore, Singapore


2025

IEEE Transactions on Cloud Computing


Year, Volume (Issue): 2025, 13(1)
  • 35