Abstract :
Dynamic Sparse Training (DST) is a rapidly evolving area of research that
seeks to optimize the sparse initialization of a neural network by adapting its
topology during training. It has been shown that under specific conditions, DST
is able to outperform dense models. The key components of this framework are
the pruning and growing criteria, which are repeatedly applied during the
training process to adjust the network's sparse connectivity. While the growing
criterion's impact on DST performance is relatively well studied, the influence
of the pruning criterion remains overlooked. To address this issue, we design
and perform an extensive empirical analysis of various pruning criteria to
better understand their impact on the dynamics of DST solutions. Surprisingly,
we find that most of the studied methods yield similar results. The differences
become more pronounced in the low-density regime, where the best performance
is predominantly obtained with the simplest technique: magnitude-based pruning. The
code is provided at https://github.com/alooow/fantastic_weights_paper
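The abstract describes DST's repeated prune-and-grow cycle and singles out magnitude-based pruning as the simplest criterion. The following is a minimal NumPy sketch of one such topology update, pairing magnitude-based pruning with random regrowth (as in SET-style methods); the function name, signature, and mask-based representation are illustrative assumptions, not the interface of the released code.

    import numpy as np

    def prune_and_grow(weights, mask, prune_fraction=0.3, rng=None):
        """One illustrative DST topology update: magnitude prune, then random grow.

        Sketch only; the paper compares several pruning criteria, of which
        magnitude (used here) is the simplest.
        """
        rng = np.random.default_rng() if rng is None else rng

        active = np.flatnonzero(mask)                # indices of active connections
        n_update = int(prune_fraction * active.size) # number of connections to replace

        # Magnitude-based pruning: drop the active weights with the smallest |w|.
        magnitudes = np.abs(weights.flat[active])
        drop = active[np.argsort(magnitudes)[:n_update]]
        mask.flat[drop] = 0
        weights.flat[drop] = 0.0

        # Random growth: re-activate the same number of previously inactive
        # connections, keeping the overall density constant.
        inactive = np.setdiff1d(np.flatnonzero(mask == 0), drop)
        grow = rng.choice(inactive, size=n_update, replace=False)
        mask.flat[grow] = 1
        weights.flat[grow] = 0.0                     # new connections start at zero

        return weights, mask

In an actual DST loop this update would be applied every few hundred training steps, with standard (dense-gradient or masked) optimization in between; the growing criterion (random here) is the component the abstract notes as already well studied.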
Disciplines :
Computer science
Author, co-author :
Nowak, Aleksandra I.; Jagiellonian University - Krakow [PL]; IDEAS NCBR [PL]
Grooten, Bram; Eindhoven University of Technology [NL]
Mocanu, Decebal Constantin; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS); Eindhoven University of Technology [NL]; University of Twente [NL]
Tabor, Jacek; Jagiellonian University - Krakow [PL]
External co-authors :
yes
Language :
English
Title :
Fantastic Weights and How to Find Them: Where to Prune in Dynamic Sparse Training
Publication date :
2023
Event name :
NeurIPS 2023: Thirty-seventh Annual Conference on Neural Information Processing Systems