On the Synchronization Bottleneck of OpenStack Swift-like Cloud Storage Systems

Titcheu Chekam, Thierry; Ennan, Zhai; Zhenhua, Li; Yong, Cui; Kui, Ren

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2016 • In IEEE International Conference on Computer Communications, San Francisco, CA 10-15 April 2016

Peer reviewed

Permalink
https://hdl.handle.net/10993/23826

Files

Full Text

main.pdf

Author postprint (556.91 kB)

Details

Keywords :

Cloud storage; OpenStack Swift; Synchronization Bottleneck

Abstract :

[en] As one of the most popular types of cloud storage services, OpenStack Swift and its follow-up systems replicate each data object across multiple storage nodes and leverage object sync protocols to achieve high availability and eventual consistency. The performance of object sync protocols relies heavily on two key parameters: r (the number of replicas of each object) and n (the number of objects hosted by each storage node). In existing tutorials and demos, the configurations are usually r = 3 and n < 1000 by default, and the object sync process seems to perform well. To understand object sync protocols in depth, we first build a lab-scale OpenStack Swift deployment and run experiments with various configurations. We discover that in data-intensive scenarios, e.g., when r > 3 and n >> 1000, the object sync process is significantly delayed and produces massive network overhead. We refer to this phenomenon as the sync bottleneck problem. Then, to explore the root cause, we review the source code of OpenStack Swift and find that its object sync protocol uses a fairly simple and network-intensive approach to check the consistency among replicas of objects. In particular, each storage node is required to periodically multicast the hash values of all its hosted objects to all the other replica nodes. Thus, in a sync round, the number of exchanged hash values per node is Θ(n × r). Further, to tackle the problem, we propose a lightweight object sync protocol called LightSync. It substantially reduces the sync overhead by using two novel building blocks: 1) Hashing of Hashes, which aggregates all the h hash values of each data partition into a single but representative hash value with a Merkle tree; 2) Circular Hash Checking, which checks the consistency of different partition replicas by sending only the aggregated hash value to the clockwise neighbor. This design provably reduces the per-node network overhead from Θ(n × r) to Θ(n/h). In addition, we have implemented LightSync as an open-source patch and applied it to OpenStack Swift, reducing sync delay by up to 28.8× and network overhead by up to 14.2×.
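
The two building blocks named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' patch or Swift's actual code: the function names, the use of SHA-256, the rule of duplicating the last node on odd tree levels, and the simple counting model (each node sends n hashes to each of the other r - 1 replicas, versus one aggregated hash per partition to one neighbor) are all assumptions made for illustration only.

```python
import hashlib


def aggregate_hash(object_hashes):
    """Hashing of Hashes (sketch): fold the hex hashes of one partition
    into a single root hash using a simple binary Merkle tree.
    SHA-256 and the odd-level padding rule are illustrative assumptions."""
    if not object_hashes:
        return hashlib.sha256(b"").hexdigest()
    # Sort so that the root does not depend on the listing order.
    level = [bytes.fromhex(h) for h in sorted(object_hashes)]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def circular_check(replica_roots):
    """Circular Hash Checking (sketch): replica i sends only its root hash
    to its clockwise neighbor (i + 1) % r. Returns the (sender, receiver)
    pairs whose root hashes disagree."""
    r = len(replica_roots)
    return [
        (i, (i + 1) % r)
        for i in range(r)
        if replica_roots[i] != replica_roots[(i + 1) % r]
    ]


def hashes_sent_original(n, r):
    """Hypothetical per-node count when all n object hashes go to each of
    the other r - 1 replica nodes, i.e. Theta(n * r)."""
    return n * (r - 1)


def hashes_sent_lightsync(n, h):
    """Hypothetical per-node count when each partition of h objects yields
    one aggregated hash sent to one neighbor, i.e. Theta(n / h)."""
    return -(-n // h)  # ceiling division
```

Under this counting model, a node with n = 1000 objects and r = 3 replicas would send 2000 hash values per round in the original scheme, and 10 if each partition aggregates h = 100 hashes. These figures only illustrate the asymptotic claim and are not measurements from the paper.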

Disciplines :

Computer science

Author, co-author :

Titcheu Chekam, Thierry; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT); Tsinghua University > School of Software, TNLIST, and KLISS MoE

Ennan, Zhai; Yale University > Computer Science

Zhenhua, Li; Tsinghua University > School of Software, TNLIST, and KLISS MoE

Yong, Cui; Tsinghua University > Computer Science and Technology

Kui, Ren; SUNY Buffalo > Computer Science and Engineering

External co-authors :

yes

Language :

English

Title :

On the Synchronization Bottleneck of OpenStack Swift-like Cloud Storage Systems

Alternative titles :

[zh] OpenStack类云存储服务的同步瓶颈研究

Publication date :

April 2016

Event name :

IEEE International Conference on Computer Communications 2016 (INFOCOM 2016)

Event organizer :

IEEE INFOCOM Organizing Committee

Event place :

San Francisco, United States - California

Event date :

10-04-2016 to 15-04-2016

Audience :

International

Main work title :

IEEE International Conference on Computer Communications, San Francisco, CA 10-15 April 2016

Publisher :

IEEE Xplore®

Peer reviewed :

Peer reviewed

Available on ORBilu :

since 24 January 2016

Statistics

Number of views

324 (18 by Unilu)

Number of downloads

627 (8 by Unilu)
