References of "IEEE Transactions on Parallel and Distributed Systems"
Full Text
Peer Reviewed
Fair Coflow Scheduling via Controlled Slowdown
De Pellegrini, Francesco; Gupta, Vaibhav Kumar UL; El Azouzi, Rachid et al

in IEEE Transactions on Parallel and Distributed Systems (2022)

The average coflow completion time (CCT) is the standard performance metric in coflow scheduling. However, standard CCT minimization may introduce unfairness between the data transfer phases of different computing jobs. Thus, while progress guarantees have been introduced in the literature to mitigate this fairness issue, the trade-off between fairness and efficiency of data transfer is hard to control. This paper introduces a fairness framework for coflow scheduling based on the concept of slowdown, i.e., the performance loss of a coflow compared to isolation. By controlling the slowdown it is possible to enforce a target coflow progress while minimizing the average CCT. In the proposed framework, the minimum slowdown for a batch of coflows can be determined in polynomial time. By showing the equivalence with Gaussian elimination, slowdown constraints are introduced into primal-dual iterations of the CoFair algorithm. The algorithm extends the class of the σ-order schedulers to solve the fair coflow scheduling problem in polynomial time. It provides a 4-approximation of the average CCT w.r.t. an optimal scheduler. Extensive numerical results demonstrate that this approach can trade off average CCT for slowdown more efficiently than existing state-of-the-art schedulers.
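
The slowdown notion in this abstract can be illustrated with a minimal Python sketch. It assumes that slowdown is measured as the ratio between a coflow's CCT under sharing and its CCT in isolation, and that the fabric behaves like a non-blocking switch whose ports all have the same capacity. Both are illustrative assumptions, and the names `isolation_cct` and `slowdown` are hypothetical rather than taken from the paper.

```python
from collections import defaultdict

def isolation_cct(flows, capacity=1.0):
    """CCT of a coflow running alone: its most loaded port is the bottleneck.

    flows: list of (ingress_port, egress_port, volume) tuples.
    """
    load = defaultdict(float)
    for src, dst, volume in flows:
        load[("in", src)] += volume    # traffic leaving through ingress port src
        load[("out", dst)] += volume   # traffic arriving at egress port dst
    return max(load.values()) / capacity

def slowdown(shared_cct, flows, capacity=1.0):
    """Assumed definition: CCT under sharing divided by CCT in isolation."""
    return shared_cct / isolation_cct(flows, capacity)

coflow_a = [(0, 1, 4.0), (0, 2, 2.0)]  # 6 units leave through ingress port 0
coflow_b = [(0, 1, 3.0)]               # shares ports 0 and 1 with coflow_a

# If the coflow served last on ingress port 0 finishes at time 9,
# the penalty depends on which coflow that is.
slowdown_if_a_last = slowdown(9.0, coflow_a)
slowdown_if_b_last = slowdown(9.0, coflow_b)
```

In this toy instance the same completion time of 9 is a much larger relative penalty for the small coflow than for the large one, which is the kind of imbalance a slowdown constraint is meant to bound.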

Full Text
Peer Reviewed
On the Synchronization Bottleneck of OpenStack Swift-like Cloud Storage Systems
Ruan, Mingkang; Titcheu Chekam, Thierry UL; Zhai, Ennan et al

in IEEE Transactions on Parallel and Distributed Systems (2018), PP(99), 1-1

As one of the most popular types of cloud storage services, OpenStack Swift and its follow-up systems replicate each object across multiple storage nodes and leverage object sync protocols to achieve high reliability and eventual consistency. The performance of object sync protocols heavily relies on two key parameters: r (number of replicas for each object) and n (number of objects hosted by each storage node). In existing tutorials and demos, the configurations are usually r = 3 and n < 1000 by default, and the sync process seems to perform well. However, we discover that in data-intensive scenarios, e.g., when r > 3 and n >> 1000, the sync process is significantly delayed and produces massive network overhead, referred to as the sync bottleneck problem. By reviewing the source code of OpenStack Swift, we find that its object sync protocol utilizes a fairly simple and network-intensive approach to check the consistency among replicas of objects. Hence in a sync round, the number of exchanged hash values per node is Θ(n × r). To tackle the problem, we propose a lightweight and practical object sync protocol, LightSync, which not only remarkably reduces the sync overhead, but also preserves high reliability and eventual consistency. LightSync derives this capability from three novel building blocks: 1) Hashing of Hashes, which aggregates all the h hash values of each data partition into a single but representative hash value with the Merkle tree; 2) Circular Hash Checking, which checks the consistency of different partition replicas by only sending the aggregated hash value to the clockwise neighbor; and 3) Failed Neighbor Handling, which properly detects and handles node failures with moderate overhead to effectively strengthen the robustness of LightSync. The design of LightSync offers a provable guarantee on reducing the per-node network overhead from Θ(n × r) to Θ(n/h). Furthermore, we have implemented LightSync as an open-source patch and applied it to OpenStack Swift, thus reducing the sync delay by up to 879x and the network overhead by up to 47.5x.
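
The first two building blocks named in this abstract can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the LightSync patch itself: SHA-256 is an arbitrary choice of hash function, odd Merkle levels are padded by duplicating the last node, and the helper names `merkle_root` and `circular_check` are hypothetical.

```python
import hashlib

def sha(data: bytes) -> bytes:
    # SHA-256 is an assumption made for this sketch.
    return hashlib.sha256(data).digest()

def merkle_root(hashes):
    """Hashing of Hashes: fold a partition's hash values into one root hash."""
    if not hashes:
        return sha(b"")
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:             # pad odd levels by repeating the last node
            level.append(level[-1])
        level = [sha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def circular_check(replica_roots):
    """Circular Hash Checking: replica i compares its root only with the
    root of its clockwise neighbour (i + 1) % r.

    Returns the indices of replicas that see a differing neighbour.
    """
    r = len(replica_roots)
    return [i for i in range(r) if replica_roots[i] != replica_roots[(i + 1) % r]]

# Three replicas of one partition with four objects; the third replica is stale.
objects = [sha(f"object-{k}".encode()) for k in range(4)]
stale = objects[:3] + [sha(b"object-3-old")]
roots = [merkle_root(objects), merkle_root(objects), merkle_root(stale)]
mismatches = circular_check(roots)
```

Each replica sends a single aggregated value per partition to one neighbour instead of exchanging every object hash with every other replica, which is where the reduction in per-node traffic comes from in this simplified picture.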

Full Text
Peer Reviewed
Load Balancing at the Edge of Chaos: How Can Self-Organized Criticality Lead to Energy-Efficient Computing
Laredo, Jean-Luis; Guinand, Frédéric; Damien, Olivier et al

in IEEE Transactions on Parallel and Distributed Systems (2017), 28

This paper investigates a self-organized critical approach for dynamically load-balancing computational workloads. The proposed model is based on the Bak-Tang-Wiesenfeld sandpile: a cellular automaton that works in a critical regime at the edge of chaos. In analogy to grains of sand, tasks arrive, pile up and slip through the different processing elements or sites of the system. When a pile exceeds a certain threshold, it collapses and initiates an avalanche of migrating tasks, thereby producing load balancing. We show that the frequency of such avalanches is in power-law relation with their sizes, a scale-invariant fingerprint of self-organized criticality that emerges without any tuning of parameters. Such an emergent pattern has organic properties such as the self-organization of tasks into resources or the self-optimization of the computing performance. The conducted experimentation also reveals that the system is balanced (i.e., not driving toward overloaded or underutilized resources) as long as the arrival rate of tasks equals the processing power of the system. Taking advantage of this fact, we hypothesize that the processing elements can be turned on and off depending on the state of the workload so as to maximize the utilization of resources. An interesting side effect is that the overall energy consumption of the system is minimized without compromising the quality of service.
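
The toppling rule described in this abstract can be sketched as follows. This is a simplified illustration, not the paper's model: it assumes the processing elements form a ring, that an overloaded site hands one task to each of its two neighbours, and that a cap on the number of topplings guards against non-terminating cascades. The function name `relax` and its parameters are hypothetical.

```python
def relax(loads, threshold, max_topples=10_000):
    """Topple overloaded sites on a ring until every pile is at or below threshold.

    Returns the relaxed loads and the avalanche size (number of topplings).
    """
    loads = list(loads)
    n = len(loads)
    topples = 0
    unstable = [i for i, x in enumerate(loads) if x > threshold]
    while unstable and topples < max_topples:
        i = unstable.pop()
        if loads[i] <= threshold:      # already relieved by an earlier step
            continue
        loads[i] -= 2                  # one task migrates to each neighbour
        topples += 1
        for j in ((i - 1) % n, (i + 1) % n):
            loads[j] += 1
            if loads[j] > threshold:
                unstable.append(j)
        if loads[i] > threshold:       # still overloaded, topple again later
            unstable.append(i)
    return loads, topples

# A burst of 9 tasks lands on site 0 of a 5-site ring with threshold 3.
balanced, avalanche_size = relax([9, 0, 0, 0, 0], threshold=3)
# balanced == [3, 3, 0, 0, 3], avalanche_size == 3
```

Tasks are conserved on the ring, so an avalanche only redistributes load. Recording `avalanche_size` over many random arrivals is how one would look for the power-law relation mentioned in the abstract.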

Full Text
Peer Reviewed
Minimum Dependencies Energy-Efficient Scheduling in Data Centers
Zotkiewicz, Mateusz UL; Guzek, Mateusz UL; Kliazovich, Dzmitry UL et al

in IEEE Transactions on Parallel and Distributed Systems (2016)

Full Text
Peer Reviewed
New Algorithms for Secure Outsourcing of Modular Exponentiations
Chen, Xiaofeng; Li, Jin; Ma, Jianfeng et al

in IEEE Transactions on Parallel and Distributed Systems (2014), 25(9), 2386-2396

With the rapid development in availability of cloud services, techniques for securely outsourcing prohibitively expensive computations to untrusted servers are receiving increasing attention in the scientific community. Exponentiations modulo a large prime have been considered the most expensive operation in discrete-logarithm based cryptographic protocols, and computationally limited devices such as RFID tags or smartcards may be incapable of performing these operations. Therefore, it is meaningful to present an efficient method to securely outsource most of this workload to (untrusted) cloud servers. In this paper, we propose a new secure outsourcing algorithm for (variable-exponent, variable-base) exponentiation modulo a prime in the two untrusted program model. Compared with the state-of-the-art algorithm [HL05], the proposed algorithm is superior in both efficiency and checkability. We then utilize this algorithm as a subroutine to achieve outsource-secure Cramer-Shoup encryptions and Schnorr signatures. Besides, we propose the first outsource-secure and efficient algorithm for simultaneous modular exponentiations. Moreover, we formally prove that both algorithms can achieve the desired security notions. We also provide an experimental evaluation that demonstrates the efficiency and effectiveness of the proposed outsourcing algorithms and schemes.
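
To give a feel for what outsourcing an exponentiation to two servers can look like, here is a deliberately simplified Python toy. It is not the algorithm proposed in the paper: it only splits the exponent additively modulo p - 1 (valid by Fermat's little theorem when the base is not a multiple of p), it does not hide the base, and it has no checkability against a cheating server. The functions `untrusted_modexp` and `outsourced_pow` are hypothetical stand-ins.

```python
import secrets

def untrusted_modexp(base, exponent, modulus):
    """Stand-in for a remote server; in this toy it simply answers honestly."""
    return pow(base, exponent, modulus)

def outsourced_pow(u, a, p):
    """Compute u**a mod p for prime p and 1 <= u < p using two exponent shares.

    Each server sees only one share of the exponent, which on its own is
    uniformly distributed modulo p - 1.
    """
    a1 = secrets.randbelow(p - 1)      # random share for the first server
    a2 = (a - a1) % (p - 1)            # complementary share for the second
    r1 = untrusted_modexp(u, a1, p)
    r2 = untrusted_modexp(u, a2, p)
    return (r1 * r2) % p               # one local multiplication recombines

p = 2**127 - 1                         # a Mersenne prime, chosen for illustration
result = outsourced_pow(5, 123456789, p)
```

The client performs only a subtraction and a multiplication locally, while the two expensive exponentiations are delegated. A real scheme additionally has to blind the base and verify the returned values, which is what the efficiency and checkability claims in the abstract refer to.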
