Scientific presentation in universities or research centers
Asynchronous Stream Data Processing using a Light-Weight and High-Performance Dataflow Engine
Ellampallil Venugopal, Vinu; Theobald, Martin
2020
 

Files


Full Text
Dbdbd2020.pdf
Publisher postprint (328.04 kB)


Details



Keywords :
Stream data processing; Big Data; sustainable-throughput
Abstract :
Processing high-throughput data streams has become a major challenge in areas such as real-time event monitoring, complex dataflow processing, and big data analytics. While distributed stream processing systems have made tremendous progress in the past few years, the combined high-throughput and low-latency requirement of modern applications (a.k.a. high sustainable-throughput) is pushing the limits of traditional data processing infrastructures. This paper introduces a new distributed stream data processing engine (DSPE), called “Asynchronous Iterative Routing” or simply AIR, which implements a light-weight, dynamic sharding protocol. AIR relies on direct, asynchronous communication among all worker nodes via multiple Message Passing Interface (MPI) communication channels and thereby completely avoids the additional communication overhead of a dedicated master node. With this design, AIR scales out to clusters of up to 8 nodes and 224 cores and performs up to 15 times better than Spark and Flink in terms of sustainable-throughput.
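The masterless routing idea in the abstract can be illustrated with a minimal sketch. It is not AIR's actual protocol: it assumes a static, hash-based sharding rule and simulates the workers in a single process instead of using MPI. The names `shard_of`, `route`, and `aggregate` are hypothetical and chosen only for this illustration.

```python
import zlib
from collections import Counter, defaultdict


def shard_of(key: str, num_workers: int) -> int:
    """Map a key to a worker rank with a stable hash (assumed scheme, not AIR's)."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def route(local_batches, num_workers):
    """Each worker partitions its own batch and hands tuples directly to peers.

    inboxes[r] collects what rank r would receive. Every sender computes the
    target rank locally, so no master node is consulted.
    """
    inboxes = defaultdict(list)
    for batch in local_batches:
        for key, value in batch:
            inboxes[shard_of(key, num_workers)].append((key, value))
    return inboxes


def aggregate(inboxes):
    """Each receiving rank sums the values of the keys it owns."""
    totals = {}
    for rank, messages in inboxes.items():
        counts = Counter()
        for key, value in messages:
            counts[key] += value
        totals[rank] = dict(counts)
    return totals
```

Because every sender applies the same deterministic rule, all tuples for a given key reach the same rank without any coordination. A real deployment would replace the in-process inboxes with non-blocking MPI sends and receives between ranks.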
Disciplines :
Computer science
Author, co-author :
Ellampallil Venugopal, Vinu ;  University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Theobald, Martin ;  University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Language :
English
Title :
Asynchronous Stream Data Processing using a Light-Weight and High-Performance Dataflow Engine
Publication date :
11 December 2020
Event name :
The Dutch-Belgian DataBase Day (DBDBD) 2020
Event organizer :
Software Languages Lab of the Vrije Universiteit Brussel
Event place :
Brussels, Belgium
Event date :
11 December 2020
Audience :
International
Available on ORBilu :
since 18 January 2021

