Keywords: machine learning; computer vision; model compression
Abstract:
Many real-world computer vision applications are required to run on hardware with limited computing power, often referred to as "edge devices". The state of the art in computer vision continues to move towards ever bigger and deeper neural networks with correspondingly rising computational requirements. Model compression methods promise to substantially reduce computation time and memory demands with little to no impact on model robustness. However, the evaluation of compression is mostly based on theoretical speedups in terms of required floating-point operations. This work offers a tool to profile the actual speedup achieved by several compression algorithms. Our results show a significant discrepancy between the theoretical and actual speedup on various hardware setups. Furthermore, we show the potential of model compression and highlight the importance of selecting the right compression algorithm for a given target task and hardware. The code to reproduce our experiments is available at https://hub.datathings.com/papers/2022-coins.
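The gap between FLOP-based and wall-clock speedup described in the abstract can be illustrated with a minimal, stdlib-only Python sketch. The layer sizes, pruning ratio, and helper names below are illustrative assumptions, not taken from the paper or its profiling tool:

```python
# Illustrative comparison of theoretical vs. measured speedup for a toy
# "channel pruning" of a naive fully connected layer. Halving the output
# channels halves the theoretical FLOP count, but the wall-clock speedup
# on real hardware can differ from that ratio.
import random
import time


def dense_forward(x, weights):
    """Naive fully connected layer: one dot product per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weights]


def flops(n_in, n_out):
    """Multiply-accumulate count of the naive dense layer (2 ops per MAC)."""
    return 2 * n_in * n_out


def measure(fn, runs=20):
    """Average wall-clock time of fn over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


n_in, n_out = 256, 256
x = [random.random() for _ in range(n_in)]
full = [[random.random() for _ in range(n_in)] for _ in range(n_out)]
pruned = full[: n_out // 2]  # drop half the output channels

theoretical = flops(n_in, n_out) / flops(n_in, n_out // 2)  # exactly 2.0
measured = measure(lambda: dense_forward(x, full)) / measure(
    lambda: dense_forward(x, pruned)
)
print(f"theoretical speedup: {theoretical:.2f}x, measured: {measured:.2f}x")
```

On real accelerators the measured ratio deviates far more than in this pure-Python toy, since memory bandwidth, kernel launch overhead, and vectorization do not scale with FLOPs; profiling on the target hardware, as the paper advocates, is what reveals the actual speedup.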
Disciplines:
Computer science
Author, co-author:
LORENTZ, Joe ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
Hartmann, Thomas; DataThings S.A.
Moawad, Assaad; DataThings S.A.
AOUADA, Djamila ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
External co-authors:
no
Document language:
English
Title:
Profiling the real world potential of neural network compression
Publication date:
01 August 2022
Event name:
2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS)
Event dates:
from 01.08.2022 to 03.08.2022
Title of the main work:
2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS), Barcelona 1-3 August 2022
Publisher:
IEEE
ISBN/EAN:
978-1-6654-8356-8
Peer reviewed:
Peer reviewed
FNR project:
FNR14297122 - Towards Edge-optimized Deep Learning For Explainable Quality Control, 2019 (01/01/2020-31/12/2023) - Joe Lorentz