Reference : TensAIR: Online Learning from Data Streams via Asynchronous Iterative Routing
E-prints/Working papers : Already available on another site
Engineering, computing & technology : Computer science
http://hdl.handle.net/10993/54534
TensAIR: Online Learning from Data Streams via Asynchronous Iterative Routing
English
Dalle Lucca Tosi, Mauro
Ellampallil Venugopal, Vinu
Theobald, Martin
2023
No
[en] Online Learning ; Neural Networks ; Asynchronous Stream Processing ; Asynchronous Stochastic Gradient Descent
[en] Online learning (OL) from data streams is an emerging area of research that encompasses numerous challenges from stream processing, machine learning, and networking. Recent extensions of stream-processing platforms, such as Apache Kafka and Flink, already provide basic support for training neural networks in a stream-processing pipeline. However, these extensions are not scalable and flexible enough for many real-world use-cases, since they do not integrate neural-network libraries as first-class citizens into their architectures. In this paper, we present TensAIR, an end-to-end dataflow engine for OL from data streams via a protocol which we refer to as asynchronous iterative routing. TensAIR supports the common dataflow operators, such as Map, Reduce, and Join, and augments them with the data-parallel OL functions train and predict. These belong to the new Model operator, in which an initial TensorFlow model (either freshly initialized or pre-trained) is replicated among multiple decentralized worker nodes. Our decentralized architecture allows TensAIR to efficiently shard incoming data batches across the distributed model replicas, which in turn trigger the model updates via asynchronous stochastic gradient descent. We empirically demonstrate that TensAIR achieves a nearly linear scale-out in terms of (1) the number of worker nodes deployed in the network, and (2) the throughput at which the data batches arrive at the dataflow operators. We exemplify the versatility of TensAIR by investigating both sparse (Word2Vec) and dense (CIFAR-10) use-cases, for which we demonstrate very significant performance improvements in comparison to Kafka, Flink, and Horovod. We also illustrate the magnitude of these improvements by demonstrating real-time concept drift adaptation of a sentiment analysis model trained over a Twitter stream.
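The abstract describes sharding incoming data batches across decentralized model replicas, each of which triggers model updates via asynchronous SGD. The following is a minimal, illustrative sketch of that idea only; the `Replica`, `shard_batch`, and `run_round` names, the scalar linear model, and the learning rate are assumptions for illustration, not the actual TensAIR API or protocol.

```python
# Illustrative sketch of decentralized, data-parallel asynchronous SGD:
# each replica holds a full model copy (here: one scalar weight for y = w*x),
# trains on its shard of the incoming batch, and routes its update to peers
# without any global synchronization barrier.
import threading

LR = 0.05  # learning rate (assumed hyperparameter)

class Replica:
    """One decentralized worker holding a full copy of the model."""
    def __init__(self):
        self.w = 0.0
        self.peers = []       # other replicas to route gradient updates to
        self.lock = threading.Lock()

    def train(self, shard):
        if not shard:
            return
        # Gradient of MSE loss for y = w * x over this replica's shard.
        with self.lock:
            grad = sum(2 * (self.w * x - y) * x for x, y in shard) / len(shard)
        delta = -LR * grad
        # Apply locally, then route the update to all peers. Because each
        # replica trains in its own thread, these updates arrive at peers
        # asynchronously, possibly mid-training.
        self.apply(delta)
        for p in self.peers:
            p.apply(delta)

    def apply(self, delta):
        with self.lock:
            self.w += delta

def shard_batch(batch, n):
    """Split one incoming data batch into n shards, one per replica."""
    return [batch[i::n] for i in range(n)]

def run_round(replicas, batch):
    """Shard one batch and let every replica train on its shard concurrently."""
    threads = [threading.Thread(target=r.train, args=(s,))
               for r, s in zip(replicas, shard_batch(batch, len(replicas)))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

In this toy setting the replicas jointly converge toward the true weight even though their updates interleave without coordination, which is the essential property the asynchronous-SGD design relies on.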
https://arxiv.org/abs/2211.10280
FnR ; FNR12252781 > Andreas Zilian > DRIVEN > Data-driven Computational Modelling And Applications > 01/09/2018 > 28/02/2025 > 2017 |