Machine Learning; Adversarial examples; Bayesian; Neural Networks; Deep Learning; Transferability
Abstract:
An established way to improve the transferability of black-box evasion attacks is to craft the adversarial examples on an ensemble-based surrogate to increase diversity. We argue that transferability is fundamentally related to uncertainty. Based on a state-of-the-art Bayesian Deep Learning technique, we propose a new method to efficiently build a surrogate by sampling approximately from the posterior distribution of neural network weights, which represents the belief about the value of each parameter. Our extensive experiments on ImageNet, CIFAR-10 and MNIST show that our approach improves the success rates of four state-of-the-art attacks significantly (by up to 83.2 percentage points), in both intra-architecture and inter-architecture transferability. On ImageNet, our approach reaches a success rate of 94% while reducing training computations from 11.6 to 2.4 exaflops, compared to an ensemble of independently trained DNNs. Our vanilla surrogate achieves higher transferability than three test-time techniques designed for this purpose in 87.5% of cases. Our work demonstrates that the way a surrogate is trained has been overlooked, although it is an important element of transfer-based attacks. We are, therefore, the first to review the effectiveness of several training methods in increasing transferability. We provide new directions to better understand the transferability phenomenon and offer a simple but strong baseline for future work.
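The abstract describes the approach only at a high level; the minimal sketch below illustrates the general idea under stated assumptions. It collects weight snapshots along a noisy SGD (SGLD-style) trajectory as approximate posterior samples, then runs an iterative FGSM-style attack whose gradient is averaged over those samples. The toy architecture, synthetic data, noise scale and hyper-parameters are illustrative placeholders, not the paper's actual posterior-sampling technique or experimental configuration.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


def make_model():
    # Toy stand-in for the surrogate architecture (the paper uses modern CNNs).
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))


# Synthetic MNIST-shaped data so the sketch runs end to end.
x_train = torch.rand(256, 1, 28, 28)
y_train = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(x_train, y_train), batch_size=64, shuffle=True)

model = make_model()
opt = torch.optim.SGD(model.parameters(), lr=0.05)
posterior_samples = []  # weight snapshots used as approximate posterior samples

# SGLD-flavoured training loop: plain SGD plus small Gaussian noise on the
# weights, keeping one snapshot per epoch after a short burn-in period.
for epoch in range(10):
    for xb, yb in loader:
        opt.zero_grad()
        F.cross_entropy(model(xb), yb).backward()
        opt.step()
        with torch.no_grad():
            for p in model.parameters():
                p.add_(1e-3 * torch.randn_like(p))  # injected noise (illustrative scale)
    if epoch >= 3:
        posterior_samples.append(copy.deepcopy(model).eval())


def craft(x, y, eps=0.3, step=0.05, n_iter=10):
    # Iterative FGSM on the surrogate: the loss gradient is averaged over the
    # sampled models, so the perturbation is not tied to a single weight vector.
    x_adv = x.clone()
    for _ in range(n_iter):
        x_adv.requires_grad_(True)
        loss = sum(F.cross_entropy(m(x_adv), y) for m in posterior_samples) / len(posterior_samples)
        (grad,) = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step * grad.sign()
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv.detach()


# Adversarial examples crafted on the surrogate, to be evaluated on a black-box target.
x_adv = craft(x_train[:8], y_train[:8])
```

The point of the sketch is only the structure: weight samples are collected cheaply along a single training run instead of training an ensemble of independent DNNs, and the attack is run against the resulting ensemble-like surrogate.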
Disciplines:
Computer science
Author, co-author:
GUBRI, Martin ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
PAPADAKIS, Mike ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
SEN, Koushik ; University of California, Berkeley > Computer Sciences Division
External co-authors:
yes
Document language:
English
Title:
Efficient and Transferable Adversarial Examples from Bayesian Neural Networks
Publication date:
2022
Event name:
Conference on Uncertainty in Artificial Intelligence (UAI)
Event dates:
from 01-08-2022 to 05-08-2022
Event scope:
International
Journal title:
The 38th Conference on Uncertainty in Artificial Intelligence