Keywords: machine learning; computer vision; model compression
Abstract:
Many real-world computer vision applications are required to run on hardware with limited computing power, often referred to as "edge devices". The state of the art in computer vision continues to move towards ever bigger and deeper neural networks with correspondingly rising computational requirements. Model compression methods promise to substantially reduce computation time and memory demands with little to no impact on model robustness. However, the evaluation of compression is mostly based on theoretical speedups in terms of required floating-point operations. This work offers a tool to profile the actual speedup achieved by several compression algorithms. Our results show a significant discrepancy between the theoretical and actual speedup on various hardware setups. Furthermore, we show the potential of model compression and highlight the importance of selecting the right compression algorithm for a given target task and hardware. The code to reproduce our experiments is available at https://hub.datathings.com/papers/2022-coins.
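The gap between FLOP-based and wall-clock speedup described in the abstract can be illustrated with a minimal, stdlib-only Python sketch. The layer sizes, pruning ratio, and helper names below are illustrative assumptions, not taken from the paper or its profiling tool:

```python
# Illustrative comparison of theoretical vs. measured speedup for a toy
# "channel pruning" of a naive fully connected layer. Halving the output
# channels halves the theoretical FLOP count, but the wall-clock speedup
# on real hardware can differ from that ratio.
import random
import time


def dense_forward(x, weights):
    """Naive fully connected layer: one dot product per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weights]


def flops(n_in, n_out):
    """Multiply-accumulate count of the naive dense layer (2 ops per MAC)."""
    return 2 * n_in * n_out


def measure(fn, runs=20):
    """Average wall-clock time of fn over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


n_in, n_out = 256, 256
x = [random.random() for _ in range(n_in)]
full = [[random.random() for _ in range(n_in)] for _ in range(n_out)]
pruned = full[: n_out // 2]  # drop half the output channels

theoretical = flops(n_in, n_out) / flops(n_in, n_out // 2)  # exactly 2.0
measured = measure(lambda: dense_forward(x, full)) / measure(
    lambda: dense_forward(x, pruned)
)
print(f"theoretical speedup: {theoretical:.2f}x, measured: {measured:.2f}x")
```

On real accelerators the measured ratio deviates far more than in this pure-Python toy, since memory bandwidth, kernel launch overhead, and vectorization do not scale with FLOPs; profiling on the target hardware, as the paper advocates, is what reveals the actual speedup.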
Disciplines:
Computer science
Author, co-author:
LORENTZ, Joe ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
Hartmann, Thomas; DataThings S.A.
Moawad, Assaad; DataThings S.A.
AOUADA, Djamila ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
External co-authors:
no
Document language:
English
Title:
Profiling the real world potential of neural network compression
Publication date:
01 August 2022
Event name:
2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS)
Event dates:
from 01.08.2022 to 03.08.2022
Title of the main work:
2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS), Barcelona 1-3 August 2022
Publisher:
IEEE
ISBN/EAN:
978-1-6654-8356-8
Peer reviewed:
Peer reviewed
FNR project:
FNR14297122 - Towards Edge-optimized Deep Learning For Explainable Quality Control, 2019 (01/01/2020-31/12/2023) - Joe Lorentz