Paper published on a website (Scientific congresses, symposiums and conference proceedings)
Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates
Yildirim, Murat Onur; Gok Yildirim, Elif Ceren; Sokar, Ghada et al.
2023
CPAL 2024: Conference on Parsimony and Learning
Peer reviewed
 

Files


Full Text
2308.14831.pdf
Author postprint (1.14 MB)

All documents in ORBilu are protected by a user license.




Details



Keywords :
Computer Science - Learning; Computer Science - Computer Vision and Pattern Recognition; Continual Learning; Sparse Training
Abstract :
[en] Continual learning (CL) refers to the ability of an intelligent system to sequentially acquire and retain knowledge from a stream of data with as little computational overhead as possible. To this end, regularization, replay, architecture, and parameter isolation approaches have been introduced in the literature. Parameter isolation uses a sparse network to allocate distinct parts of the neural network to different tasks, while also allowing parameters to be shared between tasks if they are similar. Dynamic Sparse Training (DST) is a prominent way to find these sparse networks and isolate them for each task. This paper is the first empirical study investigating the effect of different DST components under the CL paradigm, filling a critical research gap and shedding light on the optimal configuration of DST for CL, if one exists. We therefore perform a comprehensive study in which we investigate various DST components to find the best topology per task on the well-known CIFAR100 and miniImageNet benchmarks in a task-incremental CL setup, since our primary focus is to evaluate the performance of various DST criteria rather than the process of mask selection. We find that, at low sparsity levels, Erdős-Rényi Kernel (ERK) initialization utilizes the backbone more efficiently and allows increments of tasks to be learned effectively. At high sparsity levels, unless they are extreme, uniform initialization demonstrates more reliable and robust performance. In terms of growth strategy, performance depends on the chosen initialization strategy and the extent of sparsity. Finally, adaptivity within DST components is a promising direction toward better continual learners.
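The abstract contrasts two per-layer sparsity initializations (ERK vs. uniform) and notes that the growth strategy interacts with both. The following is a minimal, hypothetical NumPy sketch of these two ingredients, not the authors' implementation: the layer shapes, function names, and the gradient-magnitude growth criterion are illustrative assumptions.

import numpy as np

def uniform_densities(layer_shapes, global_density):
    # Uniform initialization: every layer gets the same density.
    return {name: global_density for name in layer_shapes}

def erk_densities(layer_shapes, global_density):
    # Erdős-Rényi-Kernel initialization: a layer's density scales with
    # (sum of its dimensions) / (product of its dimensions), so large layers
    # end up sparser than small ones while the global budget is preserved.
    params = {n: int(np.prod(s)) for n, s in layer_shapes.items()}
    dense = set()  # layers whose density would exceed 1.0 are kept fully dense
    while True:
        raw = {n: sum(layer_shapes[n]) / params[n] for n in params if n not in dense}
        budget = global_density * sum(params.values()) - sum(params[n] for n in dense)
        eps = budget / sum(raw[n] * params[n] for n in raw)
        over = [n for n in raw if eps * raw[n] > 1.0]
        if not over:
            break
        dense.update(over)  # cap these layers, then redistribute the remaining budget
    densities = {n: 1.0 for n in dense}
    densities.update({n: eps * raw[n] for n in raw})
    return densities

def prune_and_grow(weights, grads, mask, update_fraction=0.3):
    # One dynamic-sparse-training update: drop the smallest-magnitude active
    # weights, then regrow the same number of inactive connections, here using
    # gradient magnitude as the (assumed) growth criterion; regrown weights start at zero.
    n_update = int(update_fraction * mask.sum())
    active = np.flatnonzero(mask)
    inactive = np.flatnonzero(~mask.astype(bool))
    drop = active[np.argsort(np.abs(weights.flat[active]))[:n_update]]
    grow = inactive[np.argsort(-np.abs(grads.flat[inactive]))[:n_update]]
    new_mask = mask.copy()
    new_mask.flat[drop] = 0
    new_mask.flat[grow] = 1
    weights.flat[grow] = 0.0
    return weights, new_mask

# Example: at 20% global density, ERK keeps the small first layer dense
# while making the large middle layer much sparser than uniform would.
shapes = {"conv1": (3, 64, 3, 3), "conv2": (64, 128, 3, 3), "fc": (128, 100)}
print(uniform_densities(shapes, 0.2))  # {'conv1': 0.2, 'conv2': 0.2, 'fc': 0.2}
print(erk_densities(shapes, 0.2))      # roughly {'conv1': 1.0, 'conv2': 0.10, 'fc': 0.67}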
Disciplines :
Computer science
Author, co-author :
Yildirim, Murat Onur;  Eindhoven University of Technology [NL]
Gok Yildirim, Elif Ceren;  Eindhoven University of Technology [NL]
Sokar, Ghada;  Eindhoven University of Technology [NL]
MOCANU, Decebal Constantin;  University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
Vanschoren, Joaquin;  Eindhoven University of Technology [NL]
External co-authors :
yes
Language :
English
Title :
Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates
Publication date :
20 November 2023
Event name :
CPAL 2024: Conference on Parsimony and Learning
Event date :
from 2 to 4 January 2024
Audience :
International
Peer reviewed :
Peer reviewed
Focus Area :
Computational Sciences
Development Goals :
9. Industry, innovation and infrastructure
Available on ORBilu :
since 15 January 2024

