Computer Vision: Adversarial learning, adversarial attack and defense methods; Constraint Satisfaction and Optimization: Constraints and Machine Learning; Constraint Satisfaction and Optimization: Constraint Satisfaction; Constraint Satisfaction and Optimization: Constraint Optimization; Search: Evolutionary Computation
Abstract:
[en] The generation of feasible adversarial examples is necessary for properly assessing models that operate in constrained feature spaces. However, it remains challenging to enforce domain constraints in attacks that were designed for computer vision. We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints. Our framework can handle both linear and non-linear constraints. We instantiate our framework into two algorithms: a gradient-based attack that introduces the constraints into the loss function it maximizes, and a multi-objective search algorithm that aims for misclassification, perturbation minimization, and constraint satisfaction. We show that our approach is effective in four different domains, with a success rate of up to 100%, where state-of-the-art attacks fail to generate a single feasible example. In addition to adversarial retraining, we propose introducing engineered non-convex constraints to improve model adversarial robustness. We demonstrate that this new defense is as effective as adversarial retraining. Our framework constitutes the starting point for research on constrained adversarial attacks and provides relevant baselines and datasets that future research can exploit.
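As an illustration only (not code from the paper), the gradient-based idea of folding domain constraints into the maximized loss can be sketched as follows. A penalty for constraint violation is subtracted from the classification loss, so gradient ascent is pushed toward misclassifying yet feasible inputs. The linear model, the toy constraint `x[0] + x[1] <= 1`, and all parameter values below are hypothetical stand-ins chosen to keep the sketch self-contained:

```python
import numpy as np

def constraint_penalty(x):
    # Hypothetical linear domain constraint: x[0] + x[1] <= 1.
    # Returns 0 when satisfied, the violation magnitude otherwise.
    return max(0.0, float(x[0] + x[1] - 1.0))

def model_loss(x, w, y):
    # Binary logistic loss of a linear model (stand-in for a trained network).
    z = float(np.dot(w, x))
    p = 1.0 / (1.0 + np.exp(-z))
    eps = 1e-12
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def constrained_attack(x0, w, y, lam=10.0, step=0.05, iters=50):
    # Maximize (loss - lam * penalty) with sign-of-gradient ascent steps,
    # using central-difference gradients for clarity.
    x = x0.astype(float).copy()
    h = 1e-5
    for _ in range(iters):
        g = np.zeros_like(x)
        for i in range(len(x)):
            xp, xm = x.copy(), x.copy()
            xp[i] += h
            xm[i] -= h
            obj_p = model_loss(xp, w, y) - lam * constraint_penalty(xp)
            obj_m = model_loss(xm, w, y) - lam * constraint_penalty(xm)
            g[i] = (obj_p - obj_m) / (2 * h)
        x = x + step * np.sign(g)   # FGSM-style ascent step
        x = np.clip(x, 0.0, 1.0)   # assumed feature bounds
    return x
```

Starting from a feasible point, the returned example raises the classification loss while the penalty term keeps it inside the constraint region; the paper's actual attack additionally handles non-linear constraints and feature dependencies.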
Research centre:
Interdisciplinary Centre for Security, Reliability and Trust (SnT)
Disciplines:
Computer science
Author, co-author:
SIMONETTO, Thibault Jean Angel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
DYRMISHI, Salijona ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
GHAMIZI, Salah ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
External co-authors:
no
Document language:
English
Title:
A Unified Framework for Adversarial Attack and Defense in Constrained Feature Space
Publication date:
2022
Event name:
International Joint Conference on Artificial Intelligence
Event date:
from 23-07-2022 to 29-07-2022
Event scope:
International
Title of the main publication:
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22
Publisher:
International Joint Conferences on Artificial Intelligence Organization