Stream data processing; Big Data; sustainable-throughput
Abstract :
[en] Processing high-throughput data streams has become a major challenge in areas such as real-time event monitoring, complex dataflow processing, and big data analytics. While there has been tremendous progress in distributed stream processing systems in the past few years, the high-throughput and low-latency (a.k.a. high sustainable-throughput) requirements of modern applications are pushing the limits of traditional data processing infrastructures. To understand the upper bound of the maximum sustainable throughput that is possible for a given node configuration, we have designed multiple hard-coded multi-threaded processes (called ad-hoc dataflows) in C++ using the Message Passing Interface (MPI) and Pthread libraries. Our preliminary results show that our ad-hoc design is on average 5.2 times better than Flink and 9.3 times better than Spark.
Disciplines :
Computer science
Author, co-author :
Ellampallil Venugopal, Vinu ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Theobald, Martin ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
External co-authors :
no
Language :
English
Title :
Benchmarking Synchronous and Asynchronous Stream Processing Systems
Publication date :
02 January 2020
Event name :
7th ACM IKDD CoDS and 25th COMAD
Event organizer :
ACM
Event date :
from 02-01-2020 to 04-01-2020
By request :
Yes
Audience :
International
Main work title :
Benchmarking Synchronous and Asynchronous Stream Processing Systems