Asynchronous Stream Data Processing using a Light-Weight and High-Performance Dataflow Engine

ELLAMPALLIL VENUGOPAL, Vinu; THEOBALD, Martin

Scientific presentation in universities or research centers

2020

Permalink
https://hdl.handle.net/10993/45624

Files

Full Text

Dbdbd2020.pdf

Publisher postprint (328.04 kB)

All documents in ORBilu are protected by a user license.

Details

Keywords :

Stream data processing; Big Data; sustainable-throughput

Abstract :

[en] Processing high-throughput data streams has become a major challenge in areas such as real-time event monitoring, complex dataflow processing, and big data analytics. Although distributed stream processing systems have progressed considerably in the past few years, modern applications require high throughput and low latency at the same time (a.k.a. high sustainable-throughput), which pushes traditional data processing infrastructures to their limits. This paper introduces a new distributed stream data processing engine (DSPE), called “Asynchronous Iterative Routing”, or simply AIR, which implements a light-weight, dynamic sharding protocol. AIR enables direct, asynchronous communication among all worker nodes via multiple Message Passing Interface (MPI) communication channels and thereby completely avoids the additional communication overhead of a dedicated master node. With this design, AIR scales out to clusters of up to 8 nodes and 224 cores and outperforms existing DSPEs, achieving up to 15 times the sustainable-throughput of Spark and Flink.
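
The master-free routing idea in the abstract can be illustrated with a small sketch. It is not AIR's implementation or its dynamic sharding protocol: it assumes a fixed hash-based sharding rule, and it replaces MPI channels with in-process `asyncio` queues so that it runs without a cluster. The names `shard_of`, `worker`, `run`, and `word_count` are hypothetical and chosen only for this example.

```python
import asyncio
import zlib
from collections import Counter


def shard_of(key: str, num_workers: int) -> int:
    """Map a key to a worker rank with a stable hash (assumed sharding rule)."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


async def worker(rank, local_events, inboxes, results):
    """One peer: route local events to their owners, then aggregate its own shard."""
    num_workers = len(inboxes)

    # Phase 1: send each locally read event straight to the peer that owns
    # its key. No master node sits between sender and receiver.
    for key, value in local_events:
        await inboxes[shard_of(key, num_workers)].put((key, value))

    # Tell every peer (including this one) that this worker is done sending.
    for inbox in inboxes:
        await inbox.put(None)

    # Phase 2: aggregate incoming events until an end marker has arrived
    # from every peer. Each queue is FIFO, so a peer's marker always
    # follows that peer's events.
    counts = Counter()
    finished = 0
    while finished < num_workers:
        message = await inboxes[rank].get()
        if message is None:
            finished += 1
        else:
            key, value = message
            counts[key] += value
    results[rank] = counts


async def run(partitions):
    """Start one worker per input partition and collect the per-worker results."""
    n = len(partitions)
    inboxes = [asyncio.Queue() for _ in range(n)]  # stand-in for MPI channels
    results = [None] * n
    await asyncio.gather(
        *(worker(rank, partitions[rank], inboxes, results) for rank in range(n))
    )
    return results


def word_count(partitions):
    """Synchronous entry point: returns one Counter per worker."""
    return asyncio.run(run(partitions))
```

Calling `word_count` with one list of `(key, value)` pairs per worker returns one `Counter` per worker. Each key is aggregated on exactly one worker, and every worker both sends and receives, which is the property the abstract attributes to avoiding a dedicated master node.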

Disciplines :

Computer science

Author, co-author :

ELLAMPALLIL VENUGOPAL, Vinu ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)

THEOBALD, Martin ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)

Language :

English

Title :

Asynchronous Stream Data Processing using a Light-Weight and High-Performance Dataflow Engine

Publication date :

11 December 2020

Event name :

The Dutch-Belgian DataBase Day (DBDBD) 2020

Event organizer :

Software Languages Lab of the Vrije Universiteit Brussel

Event place :

Brussels, Belgium

Event date :

11 December 2020

Audience :

International

Additional URL :

https://soft.vub.ac.be/DBDBD2020/abstract/dbdbd_2020_venugopal.pdf

Available on ORBilu :

since 18 January 2021

Statistics

Number of views

146 (9 by Unilu)

Number of downloads

57 (4 by Unilu)
