Computer Vision: Adversarial learning, adversarial attack and defense methods; Constraint Satisfaction and Optimization: Constraints and Machine Learning; Constraint Satisfaction and Optimization: Constraint Satisfaction; Constraint Satisfaction and Optimization: Constraint Optimization; Search: Evolutionary Computation
Abstract:
[en] The generation of feasible adversarial examples is necessary for properly assessing models that operate in constrained feature spaces. However, it remains challenging to enforce domain constraints in attacks that were designed for computer vision. We propose a unified framework to generate feasible adversarial examples that satisfy given domain constraints. Our framework can handle both linear and non-linear constraints. We instantiate our framework into two algorithms: a gradient-based attack that introduces the constraints into the loss function it maximizes, and a multi-objective search algorithm that aims for misclassification, perturbation minimization, and constraint satisfaction. We show that our approach is effective in four different domains, with a success rate of up to 100%, where state-of-the-art attacks fail to generate a single feasible example. In addition to adversarial retraining, we propose introducing engineered non-convex constraints to improve model adversarial robustness. We demonstrate that this new defense is as effective as adversarial retraining. Our framework constitutes the starting point for research on constrained adversarial attacks and provides relevant baselines and datasets that future research can exploit.
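As an illustration only (not code from the paper), the gradient-based idea of folding domain constraints into the maximized loss can be sketched as follows. A penalty for constraint violation is subtracted from the classification loss, so gradient ascent is pushed toward misclassifying yet feasible inputs. The linear model, the toy constraint `x[0] + x[1] <= 1`, and all parameter values below are hypothetical stand-ins chosen to keep the sketch self-contained:

```python
import numpy as np

def constraint_penalty(x):
    # Hypothetical linear domain constraint: x[0] + x[1] <= 1.
    # Returns 0 when satisfied, the violation magnitude otherwise.
    return max(0.0, float(x[0] + x[1] - 1.0))

def model_loss(x, w, y):
    # Binary logistic loss of a linear model (stand-in for a trained network).
    z = float(np.dot(w, x))
    p = 1.0 / (1.0 + np.exp(-z))
    eps = 1e-12
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def constrained_attack(x0, w, y, lam=10.0, step=0.05, iters=50):
    # Maximize (loss - lam * penalty) with sign-of-gradient ascent steps,
    # using central-difference gradients for clarity.
    x = x0.astype(float).copy()
    h = 1e-5
    for _ in range(iters):
        g = np.zeros_like(x)
        for i in range(len(x)):
            xp, xm = x.copy(), x.copy()
            xp[i] += h
            xm[i] -= h
            obj_p = model_loss(xp, w, y) - lam * constraint_penalty(xp)
            obj_m = model_loss(xm, w, y) - lam * constraint_penalty(xm)
            g[i] = (obj_p - obj_m) / (2 * h)
        x = x + step * np.sign(g)   # FGSM-style ascent step
        x = np.clip(x, 0.0, 1.0)   # assumed feature bounds
    return x
```

Starting from a feasible point, the returned example raises the classification loss while the penalty term keeps it inside the constraint region; the paper's actual attack additionally handles non-linear constraints and feature dependencies.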
Research centre:
Interdisciplinary Centre for Security, Reliability and Trust (SnT)
Disciplines:
Computer science
Author, co-author:
SIMONETTO, Thibault Jean Angel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
DYRMISHI, Salijona ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
GHAMIZI, Salah ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
External co-authors:
no
Document language:
English
Title:
A Unified Framework for Adversarial Attack and Defense in Constrained Feature Space
Publication date:
2022
Event name:
International Joint Conference on Artificial Intelligence
Event date:
from 23-07-2022 to 29-07-2022
Event scope:
International
Title of the main publication:
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22
Publisher:
International Joint Conferences on Artificial Intelligence Organization