Paper published in a book (Scientific congresses, symposiums and conference proceedings)
TensAIR: Real-Time Training of Neural Networks from Data-streams
DALLE LUCCA TOSI, Mauro; Venugopal, Vinu E.; THEOBALD, Martin
2024. In ICMLSC '24: Proceedings of the 2024 8th International Conference on Machine Learning and Soft Computing
Peer reviewed
 

Files


Full Text
TensAIR Real-Time Training of Neural Networks from Data-streams - 3647750.3647762.pdf
Publisher postprint (476 kB) Creative Commons License - Attribution

Details



Keywords :
Neural Networks; Distributed Artificial Intelligence; Real-time systems
Abstract :
Online learning (OL) from data streams is an emerging area of research that encompasses numerous challenges from stream processing, machine learning, and networking. Stream-processing platforms, such as Apache Kafka and Flink, have basic extensions for training Artificial Neural Networks (ANNs) in a stream-processing pipeline. However, these extensions were not designed to train ANNs in real time, and they suffer from performance and scalability issues when doing so. This paper presents TensAIR, the first OL system for training ANNs in real time. TensAIR achieves remarkable performance and scalability by using a decentralized and asynchronous architecture to train ANN models (either freshly initialized or pre-trained) via DASGD (decentralized and asynchronous stochastic gradient descent). We empirically demonstrate that TensAIR achieves nearly linear scale-out performance in terms of (1) the number of worker nodes deployed in the network, and (2) the throughput at which the data batches arrive at the dataflow operators. We illustrate the versatility of TensAIR by investigating both sparse (word embedding) and dense (image classification) use cases, for which TensAIR achieved 6 to 116 times higher sustainable throughput rates than state-of-the-art systems for training ANNs in a stream-processing pipeline.
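As an illustration only, the general idea behind the decentralized and asynchronous SGD named in the abstract can be sketched in a few lines of Python. This is not TensAIR's implementation or API: the function names (`simulate_dasgd`, `make_batch`, `gradient`, `apply_gradient`), the synthetic linear-regression stream, the fixed message delay, and all hyperparameters are assumptions chosen for the example. Each simulated worker keeps its own model replica, applies its local gradient immediately, and sends that gradient to every peer, which applies it a few steps later without any global synchronization barrier.

```python
import random
from collections import deque


def make_batch(rng, true_w, true_b, size=8):
    """Draw one mini-batch from a synthetic stream y = true_w * x + true_b + noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    return [(x, true_w * x + true_b + rng.gauss(0.0, 0.01)) for x in xs]


def gradient(model, batch):
    """Mean-squared-error gradient for the linear model (w, b)."""
    w, b = model
    gw = gb = 0.0
    for x, y in batch:
        err = (w * x + b) - y
        gw += 2.0 * err * x
        gb += 2.0 * err
    n = len(batch)
    return (gw / n, gb / n)


def apply_gradient(model, grad, lr):
    """One plain SGD step on a (w, b) pair."""
    return (model[0] - lr * grad[0], model[1] - lr * grad[1])


def simulate_dasgd(num_workers=4, steps=400, delay=3, lr=0.02, seed=0):
    """Toy, single-process simulation (hypothetical, not the TensAIR code)."""
    rng = random.Random(seed)
    true_w, true_b = 2.0, -1.0
    # Every worker owns a full model replica; there is no central parameter server.
    models = [(0.0, 0.0) for _ in range(num_workers)]
    # inboxes[i] holds (arrival_step, gradient) messages sent by the peers of worker i.
    inboxes = [deque() for _ in range(num_workers)]

    for step in range(steps):
        for i in range(num_workers):
            # Apply peer gradients that have arrived; they are stale by `delay` steps.
            while inboxes[i] and inboxes[i][0][0] <= step:
                _, peer_grad = inboxes[i].popleft()
                models[i] = apply_gradient(models[i], peer_grad, lr)
            # Train on the next mini-batch from the stream using the local replica.
            grad = gradient(models[i], make_batch(rng, true_w, true_b))
            models[i] = apply_gradient(models[i], grad, lr)
            # Send the gradient to all peers without waiting for them.
            for j in range(num_workers):
                if j != i:
                    inboxes[j].append((step + delay, grad))

    # Deliver the messages still in flight so every replica has seen every gradient.
    for i in range(num_workers):
        while inboxes[i]:
            _, peer_grad = inboxes[i].popleft()
            models[i] = apply_gradient(models[i], peer_grad, lr)
    return models, (true_w, true_b)
```

Calling `simulate_dasgd()` returns one `(w, b)` pair per worker together with the true parameters of the synthetic stream. Because plain SGD updates are additive, every replica in this toy setting has applied the same set of gradients once the inboxes are drained, only in a different order, so the replicas agree up to floating-point rounding.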
Disciplines :
Computer science
Author, co-author :
DALLE LUCCA TOSI, Mauro ; University of Luxembourg > Faculty of Science, Technology and Medicine > Department of Computer Science > Team Martin THEOBALD
Venugopal, Vinu E. ; IIIT Bangalore, India
THEOBALD, Martin ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
External co-authors :
yes
Language :
English
Title :
TensAIR: Real-Time Training of Neural Networks from Data-streams
Publication date :
12 April 2024
Event name :
8th International Conference on Machine Learning and Soft Computing
Event place :
Singapore
Event date :
from 26 to 28 January 2024
Audience :
International
Main work title :
ICMLSC '24: Proceedings of the 2024 8th International Conference on Machine Learning and Soft Computing
Publisher :
Association for Computing Machinery
ISBN/EAN :
979-8-4007-1654-6
Pages :
73-82
Peer reviewed :
Peer reviewed
Focus Area :
Computational Sciences
FnR Project :
FNR12252781 - Data-driven Computational Modelling And Applications, 2017 (01/09/2018-28/02/2025) - Andreas Zilian
Name of the research project :
R-AGR-3440 - PRIDE17/12252781 DRIVEN_Common - ZILIAN Andreas
Funders :
Fonds National de la Recherche Luxembourg
Funding number :
PRIDE17/12252781
Available on ORBilu :
since 15 April 2024
