Francis Bach, Rodolphe Jenatton, Julien Mairal, and Guillaume Obozinski. Optimization with Sparsity-Inducing Penalties. Foundations and Trends in Machine Learning, 4(1):1-106, 2011. doi: 10.1561/2200000015. URL http://arxiv.org/abs/1108.0775.
Yu Bai, Yu-Xiang Wang, and Edo Liberty. ProxQuant: Quantized neural networks via proximal operators. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=HyzMyhCcK7.
Dimitri P. Bertsekas and John N. Tsitsiklis. Gradient convergence in gradient methods with errors. SIAM Journal on Optimization, 10(3):627-642, 2000. doi: 10.1137/S1052623497331063. URL https://doi.org/10.1137/S1052623497331063.
Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1-122, 2010. doi: 10.1561/2200000016. URL http://www.nowpublishers.com/product.aspx?product=MAL&doi=2200000016.
Xiangyi Chen, Sijia Liu, Ruoyu Sun, and Mingyi Hong. On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization. In International Conference on Learning Representations, 2019. URL http://arxiv.org/abs/1808.02941.
Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. BinaryConnect: Training deep neural networks with binary weights during propagations. In Advances in Neural Information Processing Systems 28, pp. 3123-3131, 2015.
Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized Neural Networks. In Advances in Neural Information Processing Systems 29, 2016. URL https://papers.nips.cc/paper/6573-binarized-neural-networks.pdf.
Trevor Gale, Erich Elsen, and Sara Hooker. The State of Sparsity in Deep Neural Networks. arXiv preprint arXiv:1902.09574, 2019. URL http://arxiv.org/abs/1902.09574.
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
Lu Hou, Quanming Yao, and James T. Kwok. Loss-aware binarization of deep networks. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=S1oWlN9ll.
Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2017. URL https://arxiv.org/abs/1608.06993.
Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization. In International Conference on Learning Representations, 2015. URL http://arxiv.org/abs/1412.6980.
Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009. URL https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
Yunwen Lei, Ting Hu, and Ke Tang. Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions. To appear in IEEE Transactions on Neural Networks and Learning Systems, 2019. URL https://ieeexplore.ieee.org/document/8930994.
Christos Louizos, Max Welling, and Diederik P. Kingma. Learning Sparse Neural Networks through L0 Regularization. In International Conference on Learning Representations, 2018. URL http://arxiv.org/abs/1712.01312.
Liangchen Luo, Yuanhao Xiong, Yan Liu, and Xu Sun. Adaptive Gradient Method with Dynamic Bound of Learning Rate. In International Conference on Learning Representations, 2019. URL http://arxiv.org/abs/1902.09843v1.
Neal Parikh and Stephen Boyd. Proximal algorithms. Foundations and Trends in Optimization, 1(3):127-239, 2014. doi: 10.1561/2400000003. URL http://www.nowpublishers.com/articles/foundations-and-trends-in-optimization/OPT-003.
Sashank J. Reddi, Satyen Kale, and Sanjiv Kumar. On the convergence of Adam and beyond. In International Conference on Learning Representations, 2018. URL http://arxiv.org/abs/1904.09237.
Andrzej Ruszczyński. Feasible direction methods for stochastic programming problems. Mathematical Programming, 19(1):220-229, December 1980. doi: 10.1007/BF01581643. URL https://doi.org/10.1007/BF01581643.
Yang Yang, Gesualdo Scutari, Daniel P. Palomar, and Marius Pesavento. A parallel decomposition method for nonconvex stochastic multi-agent optimization problems. IEEE Transactions on Signal Processing, 64(11):2949-2964, June 2016. doi: 10.1109/TSP.2016.2531627. URL http://ieeexplore.ieee.org/document/7412752/.
Penghang Yin, Shuai Zhang, Jiancheng Lyu, Stanley Osher, Yingyong Qi, and Jack Xin. BinaryRelax: A Relaxation Approach for Training Deep Neural Networks with Quantized Weights. SIAM Journal on Imaging Sciences, 11(4):2205-2223, January 2018. doi: 10.1137/18M1166134. URL https://epubs.siam.org/doi/10.1137/18M1166134.