[en] Artificial Neural Networks (ANNs) have drawn academic and industry attention for their ability to represent and solve complex problems. Researchers are studying how to distribute their computation to reduce training time. However, the most common approaches in this direction are synchronous, leaving computational resources underutilized. Asynchronous training does not have this drawback but is affected by stale gradient updates, which have not yet been extensively researched. Considering this, we experimentally investigate how stale gradients affect the convergence time and loss value of an ANN. In particular, we analyze an asynchronous distributed implementation of a Word2Vec model, in which the impact of staleness is negligible and can be ignored given the computational speedup we achieve by allowing it.
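To make the staleness setting concrete, below is a minimal sketch of asynchronous SGD with bounded-staleness gradient updates on a toy linear model. This is not the paper's Word2Vec implementation; the `staleness` bound, the snapshot history, and the mini-batch size are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation) of asynchronous SGD with
# stale gradients, using a simple linear-regression loss for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = X @ w_true + noise
X = rng.normal(size=(256, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true + 0.01 * rng.normal(size=256)

def gradient(w, Xb, yb):
    """Gradient of the mean squared error on a mini-batch."""
    return 2.0 / len(yb) * Xb.T @ (Xb @ w - yb)

w = np.zeros(4)          # shared parameters (the "server" copy)
lr = 0.05
staleness = 4            # illustrative bound: a worker's snapshot may lag by up to 4 steps
snapshots = [w.copy()]   # history of parameter versions

for step in range(500):
    # An asynchronous worker reads a possibly stale snapshot of the parameters ...
    lag = rng.integers(0, staleness + 1)
    w_stale = snapshots[max(0, len(snapshots) - 1 - lag)]
    # ... computes a mini-batch gradient against that snapshot ...
    idx = rng.choice(len(y), size=32, replace=False)
    g = gradient(w_stale, X[idx], y[idx])
    # ... and the shared parameters apply the stale update without waiting.
    w = w - lr * g
    snapshots.append(w.copy())

print("recovered weights:", np.round(w, 3))  # close to w_true despite staleness
```

Raising `staleness` in this toy setup makes the delay of the applied gradients visible in the loss trajectory; the abstract reports that for its Word2Vec workload this effect is negligible relative to the speedup gained by not synchronizing.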
Disciplines:
Computer science
Author, co-author:
DALLE LUCCA TOSI, Mauro ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Ellampallil Venugopal, Vinu
THEOBALD, Martin ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
External co-authors:
no
Document language:
English
Title:
Convergence time analysis of Asynchronous Distributed Artificial Neural Networks
Publication date:
2022
Event name:
CODS-COMAD 2022: 5th Joint International Conference on Data Science & Management of Data (9th ACM IKDD CODS and 27th COMAD)
Event date:
from 07-01-2022 to 10-01-2022
Event scope:
International
Title of the main work:
5th Joint International Conference on Data Science & Management of Data (9th ACM IKDD CODS and 27th COMAD)
Pagination:
314–315
Peer reviewed:
Peer reviewed
FNR project:
FNR12252781 - Data-driven Computational Modelling And Applications, 2017 (01/09/2018-28/02/2025) - Andreas Zilian