Keywords :
Dynamic sparse training; Deep neural networks; Representation; Optimization; Generalization
Abstract :
[en] Deep Neural Networks excel in tasks like pattern recognition and natural language processing, but they are often too large and computationally intensive for embedded systems with limited hardware resources. Recently, Dynamic Sparse Training (DST), which starts from a sparse network and simultaneously optimizes the sparse connectivity and the weights, has been empirically validated as an effective solution that reduces the resource budget while matching or exceeding the performance of dense networks. Despite the phenomenal empirical success of DST, its theoretical understanding still lags behind. This Ph.D. study embarks on a comprehensive exploration of DST through the three fundamental pillars of machine learning theory for supervised learning: representation, optimization, and generalization.
Disciplines :
Computer science
Author, co-author :
WU, Boqian ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Van Keulen, Maurice; University of Twente, Netherlands
MOCANU, Decebal Constantin ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS) ; Eindhoven University of Technology, Netherlands ; University of Twente, Netherlands
Mocanu, Elena; University of Twente, Netherlands
External co-authors :
no
Language :
English
Title :
Insights into Dynamic Sparse Training: Theory Meets Practice