Journal information
IEEE Transactions on Cloud Computing
The Institute of Electrical and Electronics Engineers, Inc.

Quarterly

Officially published

    Optimizing Renewable Energy Utilization in Cloud Data Centers Through Dynamic Overbooking: An MDP-Based Approach

    Tuhin Chakraborty, Carlo Kopp, Adel N. Toosi
    pp. 1-17
    Abstract: The shift towards renewable energy sources for powering data centers is increasingly important in the era of cloud computing. However, integrating renewable energy sources into cloud data centers presents a challenge due to their variable and intermittent nature. The unpredictable workload demands in cloud data centers further complicate this problem. In response to this pressing challenge, we propose a novel approach in this paper: adapting the workload to match the renewable energy supply. Our solution involves dynamic overbooking of resources, providing energy flexibility to data center operators. We propose a framework that stochastically models both workload and energy source information, leveraging Markov Decision Processes (MDP) to determine the optimal overbooking degree based on the workload flexibility of data center clients. We validate the proposed algorithm in realistic settings through extensive simulations. Results demonstrate the superiority of our proposed method over existing approaches, achieving better matching with the renewable energy supply by 55.6%, 34.65%, and 40.7% for workload traces from Nectar Cloud, Google, and Wikipedia, respectively.
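
As a rough illustration of how an MDP can pick an overbooking degree, the sketch below runs plain value iteration on a toy model. It is not the authors' framework: the two supply states, the two overbooking degrees, the transition probabilities, and the mismatch-based reward are all hypothetical numbers chosen for the example.

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-9):
    """Plain value iteration; returns (state values, greedy policy)."""
    V = {s: 0.0 for s in states}
    while True:
        # Q[s][a] = immediate reward + discounted expected value of the next state
        Q = {
            s: {
                a: R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())
                for a in actions
            }
            for s in states
        }
        new_V = {s: max(Q[s].values()) for s in states}
        if max(abs(new_V[s] - V[s]) for s in states) < tol:
            policy = {s: max(Q[s], key=Q[s].get) for s in states}
            return new_V, policy
        V = new_V


# Toy model (every number below is an assumption, not data from the paper).
states = ["low", "high"]            # renewable supply level
actions = [1.0, 1.5]                # overbooking degree
supply = {"low": 8.0, "high": 15.0}  # available renewable power per state
base_load = 10.0                    # power drawn at overbooking degree 1.0

# Supply evolves as a Markov chain that does not depend on the action.
next_supply = {
    "low": {"low": 0.7, "high": 0.3},
    "high": {"low": 0.4, "high": 0.6},
}
P = {s: {a: next_supply[s] for a in actions} for s in states}

# Reward: negative mismatch between power drawn and renewable supply.
R = {s: {a: -abs(base_load * a - supply[s]) for a in actions} for s in states}

values, policy = value_iteration(states, actions, P, R)
print(policy)
```

In this toy model the transitions ignore the action, so the policy reduces to choosing the degree with the smallest immediate mismatch in each state; a realistic model would let the overbooking decision influence future workload states.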

    Multi-Granularity Federated Learning by Graph-Partitioning

    Ziming Dai, Yunfeng Zhao, Chao Qiu, Xiaofei Wang...
    pp. 18-33
    Abstract: In edge computing, energy-limited distributed edge clients present challenges such as heterogeneity, high energy consumption, and security risks. Traditional blockchain-based federated learning (BFL) struggles to address all three of these challenges simultaneously. This article proposes a Graph-Partitioning Multi-Granularity Federated Learning method on a consortium blockchain, namely GP-MGFL. To reduce the overall communication overhead, we adopt a balanced graph partitioning algorithm while introducing observer and consensus nodes. This method groups clients to minimize high-cost communications and focuses on the guidance effect within each group, thereby ensuring effective guidance with reduced overhead. To fully leverage heterogeneity, we introduce a cross-granularity guidance mechanism. This mechanism involves fine-granularity models guiding coarse-granularity models to enhance the accuracy of the latter models. We also introduce a credit model to adjust the contribution of models to the global model dynamically and to dynamically select leaders responsible for model aggregation. Finally, we implement a prototype system on real physical hardware and compare it with several baselines. Experimental results show that the accuracy of the GP-MGFL algorithm is 5.6% higher than that of ordinary BFL algorithms. In addition, compared to other grouping methods, such as greedy grouping, the accuracy of the proposed method improves by about 1.5%. In scenarios with malicious clients, the maximum accuracy improvement reaches 11.1%. We also analyze and summarize the impact of grouping and the number of clients on the model, as well as the impact of this method on the inherent security of the blockchain itself.

    A Method to Compare Scaling Algorithms for Cloud-Based Services

    Danny De Vleeschauwer, Chia-Yu Chang, Paola Soto, Yorick De Bock...
    pp. 34-45
    Abstract: Nowadays, many services are offered via the cloud, i.e., they rely on interacting software components that can run on a set of connected Commercial Off-The-Shelf (COTS) servers sitting in data centers. As the demand for any particular service evolves over time, the computational resources associated with the service must be scaled accordingly while keeping the Key Performance Indicators (KPIs) associated with the service under control. Consequently, scaling always involves a delicate trade-off between using the cloud resources and complying with the KPIs. In this paper, we show that a (workload-dependent) Pareto front embodies this trade-off’s limits. We identify this Pareto front for various workloads and assess the ability of several scaling algorithms to approach that Pareto front.
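
To make the Pareto-front idea concrete, here is a minimal sketch that extracts the non-dominated points from a set of (resource usage, KPI violation) pairs, where both quantities are to be minimized. The function name, the two-objective framing, and the sample numbers are assumptions for illustration; the paper's own procedure for identifying the front is not described in the abstract.

```python
from typing import List, Tuple


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the points not dominated by any other (both coordinates minimized)."""
    front = []
    best_violation = float("inf")
    # Sweep in order of increasing resource usage; a point survives only if it
    # strictly improves on the best KPI violation seen at lower or equal usage.
    for usage, violation in sorted(set(points)):
        if violation < best_violation:
            front.append((usage, violation))
            best_violation = violation
    return front


# Hypothetical outcomes of several scaling configurations:
# (average number of servers used, fraction of requests violating the KPI).
outcomes = [(4, 0.01), (2, 0.10), (3, 0.10), (3, 0.05), (5, 0.02), (1, 0.30)]
print(pareto_front(outcomes))
```

A scaling algorithm can then be judged by how far its own (usage, violation) point lies from this front for the same workload.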

    HyperPart: A Hypergraph-Based Abstraction for Deduplicated Storage Systems

    Geyao Cheng, Junxu Xia, Lailong Luo, Haibo Mi...
    pp. 46-60
    Abstract: Currently, deduplication techniques are utilized to minimize the space overhead by deleting redundant data blocks across large-scale servers in data centers. However, such a process exacerbates the fragmentation of data blocks, causing more cross-server file retrievals with plummeting retrieval throughput. Some attempts prefer better file retrieval performance by confining all blocks of a file to one single server, resulting in non-trivial space consumption for more replicated blocks across servers. An ideal network storage system, in effect, should take both the deduplication and retrieval performance into account by implementing reasonable assignment of the detected unique blocks. Such a fine-grained assignment requires an accurate and comprehensive abstraction of the files, blocks, and the file-block affiliation relationships. To achieve this, we innovatively design the weighted hypergraph to profile the multivariate data correlations. With this delicate abstraction in place, we propose HyperPart, which elegantly transforms this complex block allocation problem into a hypergraph partition problem. For more general scenarios with dynamic file updates, we further propose a two-phase incremental hypergraph repartition scheme, which mitigates the performance degradation with minimal migration volume. We implement a prototype system of HyperPart, and the experimental results validate that it saves around 50% of the storage space and improves the retrieval throughput by approximately 30% over state-of-the-art methods under the balance constraints.

    A Run-Time Framework for Ensuring Zero-Trust State of Client’s Machines in Cloud Environment

    Devki Nandan Jha, Graham Lenton, James Asker, David Blundell...
    pp. 61-74
    Abstract: With the unprecedented demand for cloud computing, ensuring trust in the underlying environment is challenging. Applications executing in the cloud are prone to attacks of different types including malware, network and data manipulation. These attacks may remain undetected for a significant length of time thus causing a lack of trust. Untrusted cloud services can also lead to business losses in many cases and therefore need urgent attention. In this paper, we present Trusted Public Cloud (TPC), a generic framework ensuring the zero-trust security of client machines. It tracks the system state, alerting the user of unexpected changes in the machine’s state, thus increasing the run-time detection of security vulnerabilities. We validated TPC on Microsoft Azure with Local, Software Trusted Platform Module (SWTPM) and Software Guard Extension (SGX)-enabled SWTPM security providers. We also evaluated the scalability of TPC on Amazon Web Services (AWS) with a varying number of client machines executing in a concurrent environment. The execution results show the effectiveness of TPC as it takes a maximum of 35.6 seconds to recognise the system state when there are 128 client machines attached.

    An Efficient Delegatable Order-Revealing Encryption Scheme for Multi-User Range Queries

    Jingru Xu, Cong Peng, Rui Li, Jintao Fu...
    pp. 75-86
    Abstract: To balance data confidentiality and availability, order-revealing encryption (ORE) has emerged as a pivotal primitive facilitating range queries on encrypted data. However, challenges arise in diverse user domains where data is encrypted with different keys, giving rise to the development of delegatable order-revealing encryption (DORE) schemes. Regrettably, existing DORE schemes are susceptible to authorization token forgery attacks and rely on computationally intensive bilinear pairings. This work proposes a novel solution to address these challenges. We first introduce a delegatable equality-revealing encryption scheme, enabling the comparison of ciphertexts encrypted by distinct secret keys through authorization tokens. Building upon this, we present a delegatable order-revealing encryption that leverages bitwise encryption. DORE supports efficient multi-user ciphertext comparison while robustly resisting authorization token forgery attacks. Significantly, our approach distinguishes itself by minimizing bilinear pairings. Experimental results highlight the efficacy of DORE, showcasing a notable speedup of 2.8× in encryption performance and 1.33× in comparison performance compared to previous DORE schemes, respectively.

    Optical Self-Adjusting Data Center Networks in the Scalable Matching Model

    Caio Alves Caldeira, Otávio Augusto de Oliveira Souza, Olga Goussevskaia, Stefan Schmid...
    pp. 87-98
    Abstract: Self-Adjusting Networks (SAN) optimize their physical topology toward the demand in an online manner. Their application in data center networks is motivated by emerging hardware technologies, such as 3D MEMS Optical Circuit Switches (OCS). The Matching Model (MM) has been introduced to study the hybrid architecture of such networks. It abstracts from the electrical switches and focuses on the added (reconfigurable) optical ones. MM defines any SAN topology as a union of matchings over a set of top-of-rack (ToR) nodes, and assumes that rearranging the edges of a single matching comes at a fixed cost. In this work, we propose and study the Scalable Matching Model (SMM), a generalization of the MM, and present OpticNet, a framework that maps a set of ToRs to a set of OCSs to form a SAN topology. We prove that OpticNet uses the minimum number of switches to realize any bounded-degree topology and allows existing SAN algorithms to run on top of it, while preserving amortized performance guarantees. Our experimental results based on real workloads show that OpticNet is a flexible and efficient framework for the implementation and evaluation of SAN algorithms in reconfigurable data center environments.
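
The "union of matchings" view of a topology can be sketched as follows: each edge between two ToR nodes is assigned to a matching in which neither endpoint already appears. This is a simple greedy assignment written for illustration, not OpticNet's mapping; unlike the result claimed in the abstract, the greedy rule does not guarantee the minimum number of matchings. The function name and the example ring are assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def decompose_into_matchings(edges: List[Edge]) -> List[List[Edge]]:
    """Greedily split an edge list into matchings (no node repeats within one)."""
    used: Dict[Hashable, Set[int]] = defaultdict(set)  # node -> matching indices touching it
    matchings: List[List[Edge]] = []
    for u, v in edges:
        # Pick the first matching in which both endpoints are still free.
        k = 0
        while k in used[u] or k in used[v]:
            k += 1
        if k == len(matchings):
            matchings.append([])
        matchings[k].append((u, v))
        used[u].add(k)
        used[v].add(k)
    return matchings


# Hypothetical topology: four ToR nodes connected in a ring.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
for index, matching in enumerate(decompose_into_matchings(ring)):
    print(index, matching)
```

In the Matching Model, each such matching could correspond to the configuration of one optical switch, and reconfiguring the topology means changing the edges inside one or more matchings.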

    Efficient Dynamic Resource Management for Spatial Multitasking GPUs

    Hoda Sedighi, Daniel Gehberger, Amin Ebrahimzadeh, Fetahi Wuhib...
    pp. 99-117
    Abstract: The advent of microservice architecture enables complex cloud applications to be realized via a set of individually isolated components, increasing their flexibility and performance. As these applications require massive computing resources, graphics processing units (GPUs) are being widely used as high-speed parallel computing devices to meet the stringent demands. Although current GPUs allow application components to be executed concurrently via spatial multitasking, they face several challenges. The first challenge is allocating the computing resources to components dynamically to maximize efficiency. The second challenge is avoiding performance degradation caused by the data transfer overhead between the components. To address these challenges, we propose an efficient GPU resource management technique that dynamically allocates GPU resources to application components. The proposed method allocates resources based on component workloads and uses online performance monitoring to guarantee the application's performance. We also propose a GPU memory manager to reduce the data transfer overhead between components via shared memory. Our evaluation results indicate that the proposed dynamic resource allocation method improves application throughput by up to 134.12% compared to the state-of-the-art spatial multitasking techniques. We also show that using a shared memory results in 6× throughput improvement compared to the baseline User Datagram Protocol (UDP)-based technique.

    IBNR-RD: Intra-Block Neighborhood Relationship-Based Resemblance Detection for High-Performance Multi-Node Post-Deduplication

    Dewen Zeng, Wenlong Tian, Tingting He, Ruixuan Li...
    pp. 118-129
    Abstract: Post-deduplication in traditional cloud environments primarily focuses on single-node, where delta compression is performed on the same deduplication node located on the server side. However, with data explosion, the multi-node post-deduplication, also called global deduplication, has become a hot issue in research communities, which aims to simultaneously execute delta compression on data distributed across all nodes. Simply setting up single-node deduplication systems on multi-node environments would significantly affect storage utilization and incur secondary overhead from file migration. Nevertheless, existing global deduplication solutions suffer from lower data compression ratios and high computational overhead due to their resemblance detection's inherent limitations and overly coarse granularities. Similar blocks typically have high correlations between sub-blocks; inspired by this observation, we propose IBNR (Intra-Block Neighborhood Relationship-Based Resemblance Detection for High-Performance Multi-Node Post-Deduplication), which introduces a novel resemblance detection based on relationships between sub-blocks and determines the ownership of blocks in the entry stage to achieve efficient global deduplication. Furthermore, the by-products of IBNR have shown powerful scalability by replacing internal resemblance detection scheme with existing solutions on practical workloads. Experimental results indicate that IBNR outperforms state-of-the-art solutions, achieving an average 1.99× data reduction ratio and varying degrees of improvement across other key metrics.

    SST-LOF: Container Anomaly Detection Method Based on Singular Spectrum Transformation and Local Outlier Factor

    Shilei Bu, Minpeng Jin, Jie Wang, Yulai Xie...
    pp. 130-147
    Abstract: In recent years, the use of container cloud platforms has experienced rapid growth. However, because containers are operating-system-level virtualization, their isolation is far less than that of virtual machines, posing considerable challenges for multi-tenant container cloud platforms. To address the issues associated with current container anomaly detection algorithms, such as the difficulty in mining periodic features and the high rate of false positives due to noisy data, we propose an anomaly detection method named SST-LOF, based on singular spectrum transformation and the local outlier factor. Our method enhances the traditional Singular Spectrum Transformation (SST) algorithm to meet the needs of streaming unsupervised detection. Furthermore, our method improves the calculation mode of the anomaly score of the Local Outlier Factor algorithm (LOF) and reduces false positives of noisy data with dynamic sliding windows. Additionally, we have designed and implemented a container cloud anomaly detection system that can perform real-time, unsupervised, streaming anomaly detection on containers quickly and accurately. The experimental results demonstrate the effectiveness and efficiency of our method in detecting anomalies in containers in both simulated and real cloud environments.
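
For readers unfamiliar with the Local Outlier Factor, the sketch below computes the standard (unmodified) LOF score that SST-LOF builds on. It does not include the paper's changes to the score calculation, the sliding windows, or the SST stage. The function name, the choice of k, and the sample readings are assumptions, and the code assumes no duplicate points (which would make a reachability sum zero).

```python
import math
from typing import List, Sequence


def lof_scores(points: List[Sequence[float]], k: int) -> List[float]:
    """Standard Local Outlier Factor; scores well above 1 suggest outliers."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point (excluding itself).
    neigh = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Distance from each point to its k-th nearest neighbour.
    k_dist = [dist[i][neigh[i][-1]] for i in range(n)]

    def local_reachability_density(i: int) -> float:
        # Reachability distance of i from neighbour j: max(k-distance(j), d(i, j)).
        reach = [max(k_dist[j], dist[i][j]) for j in neigh[i]]
        return len(reach) / sum(reach)

    lrd = [local_reachability_density(i) for i in range(n)]
    # LOF: average density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neigh[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical one-dimensional metric readings; the last one is far from the rest.
readings = [(0.0,), (1.0,), (2.0,), (3.0,), (20.0,)]
scores = lof_scores(readings, k=2)
print([round(s, 2) for s in scores])
```

Points inside the dense cluster score close to 1, while the isolated reading scores far higher, which is the signal a detector would threshold on.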