Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Proceedings of annual conference on Neural Information Processing Systems (NeurIPS). Curran Associates, Inc., pp 2672–2680
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 770–778
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 4700–4708
Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), pp 4171–4186
Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler D, Wu J, Winter C, Hesse C, Chen M, Sigler E, Litwin M, Gray S, Chess B, Clark J, Berner C, McCandlish S, Radford A, Sutskever I, Amodei D (2020) Language models are few-shot learners. In: Proceedings of annual conference on Neural Information Processing Systems (NeurIPS). Curran Associates, Inc., pp 1877–1901
Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, Kang J (2020) BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36(4):1234–1240
Demontis A, Melis M, Pintor M, Jagielski M, Biggio B, Oprea A, Nita-Rotaru C, Roli F (2019) Why do adversarial attacks transfer? Explaining transferability of evasion and poisoning attacks. In: Proceedings of USENIX security symposium (USENIX Security). USENIX Association, pp 321–338
Cao Y, Xiao C, Cyr B, Zhou Y, Park W, Rampazzi S, Chen QA, Fu K, Mao ZM (2019) Adversarial sensor attack on lidar-based perception in autonomous driving. In: Proceedings of ACM SIGSAC conference on Computer and Communications Security (CCS). ACM, pp 2267–2281
Ji Y, Zhang X, Ji S, Luo X, Wang T (2018) Model-reuse attacks on deep learning systems. In: Proceedings of ACM SIGSAC conference on Computer and Communications Security (CCS). ACM, pp 349–363
Pang R, Shen H, Zhang X, Ji S, Vorobeychik Y, Luo X, Liu A, Wang T (2020) A tale of evil twins: adversarial inputs versus poisoned models. In: Proceedings of ACM SIGSAC conference on Computer and Communications Security (CCS). ACM, pp 85–99
Pei K, Cao Y, Yang J, Jana S (2017) Deepxplore: automated whitebox testing of deep learning systems. In: Proceedings of Symposium on Operating Systems Principles (SOSP). ACM, pp 1–18
Fang Z, Li Y, Lu J, Dong J, Han B, Liu F (2022) Is out-of-distribution detection learnable? In: Proceedings of annual conference on Neural Information Processing Systems (NeurIPS). Curran Associates, Inc
Biggio B, Corona I, Maiorca D, Nelson B, Šrndić N, Laskov P, Giacinto G, Roli F (2013) Evasion attacks against machine learning at test time. In: Proceedings of joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD). Springer, pp 387–402
Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R (2014) Intriguing properties of neural networks. In: Proceedings of International Conference on Learning Representations (ICLR)
Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A (2018) Towards deep learning models resistant to adversarial attacks. In: Proceedings of International Conference on Learning Representations (ICLR)
Brendel W, Rauber J, Bethge M (2018) Decision-based adversarial attacks: reliable attacks against black-box machine learning models. In: Proceedings of International Conference on Learning Representations (ICLR)
Chen J, Jordan MI, Wainwright MJ (2020) Hopskipjumpattack: a query-efficient decision-based attack. In: Proceedings of IEEE symposium on Security and Privacy (SP). IEEE, pp 1277–1294
Nguyen A, Yosinski J, Clune J (2015) Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. In: Proceedings of IEEE/CVF conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 427–436
Hein M, Andriushchenko M, Bitterwolf J (2019) Why relu networks yield high-confidence predictions far away from the training data and how to mitigate the problem. In: Proceedings of IEEE/CVF conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 41–50
Meinke A, Hein M (2020) Towards neural networks that provably know when they don’t know. In: Proceedings of International Conference on Learning Representations (ICLR)
Hendrycks D, Gimpel K (2017) A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: Proceedings of International Conference on Learning Representations (ICLR)
Sehwag V, Bhagoji AN, Song L, Sitawarin C, Cullina D, Chiang M, Mittal P (2019) Analyzing the robustness of open-world machine learning. In: Proceedings of ACM workshop on artificial intelligence and security. ACM, pp 105–116
Carlini N, Wagner D (2017) Towards evaluating the robustness of neural networks. In: Proceedings of IEEE symposium on Security and Privacy (SP). IEEE, pp 39–57
Croce F, Hein M (2020) Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In: Proceedings of International Conference on Machine Learning (ICML). PMLR, pp 2206–2216
Yamamura K, Sato H, Tateiwa N, Hata N, Mitsutake T, Oe I, Ishikura H, Fujisawa K (2022) Diversified adversarial attacks based on conjugate gradient method. In: Proceedings of International Conference on Machine Learning (ICML). PMLR, pp 24872–24894
Chen P-Y, Zhang H, Sharma Y, Yi J, Hsieh C-J (2017) Zoo: zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In: Proceedings of ACM workshop on artificial intelligence and security. ACM, pp 15–26
Cheng M, Singh S, Chen P, Chen P-Y, Liu S, Hsieh C-J (2020) Sign-opt: a query-efficient hard-label adversarial attack. In: Proceedings of International Conference on Learning Representations (ICLR)
Liang S, Li Y, Srikant R (2018) Enhancing the reliability of out-of-distribution image detection in neural networks. In: Proceedings of International Conference on Learning Representations (ICLR)
Hendrycks D, Mazeika M, Dietterich T (2019) Deep anomaly detection with outlier exposure. In: Proceedings of International Conference on Learning Representations (ICLR)
Chen J, Li Y, Wu X, Liang Y, Jha S (2021) Atom: robustifying out-of-distribution detection using outlier mining. In: Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD)
Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In: Proceedings of International Conference on Learning Representations (ICLR)
Baluja S, Fischer I (2018) Learning to attack: adversarial transformation networks. In: Proceedings of AAAI conference on artificial intelligence (AAAI). AAAI
Xiao C, Li B, Zhu J-Y, He W, Liu M, Song D (2018) Generating adversarial examples with adversarial networks. In: Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pp 3905–3911
Song Y, Shu R, Kushman N, Ermon S (2018) Constructing unrestricted adversarial examples with generative models. In: Proceedings of annual conference on Neural Information Processing Systems (NeurIPS). Curran Associates, Inc., pp 8322–8333
Yang J, Zhou K, Li Y, Liu Z (2021) Generalized out-of-distribution detection: a survey. arXiv preprint arXiv:2110.11334
Shen Z, Liu J, He Y, Zhang X, Xu R, Yu H, Cui P (2021) Towards out-of-distribution generalization: a survey. arXiv preprint arXiv:2108.13624
Radford A, Metz L, Chintala S (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In: Proceedings of International Conference on Learning Representations (ICLR)
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: Proceedings of International Conference on Machine Learning (ICML). PMLR, pp 214–223
Brock A, Donahue J, Simonyan K (2019) Large scale GAN training for high fidelity natural image synthesis. In: Proceedings of International Conference on Learning Representations (ICLR)
Miyato T, Kataoka T, Koyama M, Yoshida Y (2018) Spectral normalization for generative adversarial networks. In: Proceedings of International Conference on Learning Representations (ICLR)
Karras T, Aila T, Laine S, Lehtinen J (2018) Progressive growing of GANs for improved quality, stability, and variation. In: Proceedings of International Conference on Learning Representations (ICLR)
Karras T, Laine S, Aittala M, Hellsten J, Lehtinen J, Aila T (2020) Analyzing and improving the image quality of StyleGAN. In: Proceedings of IEEE/CVF conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 8110–8119
Karras T, Laine S, Aila T (2019) A style-based generator architecture for generative adversarial networks. In: Proceedings of IEEE/CVF conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 4401–4410
Robbins H, Monro S (1951) A stochastic approximation method. Ann Math Stat 22(3):400–407
Powell MJD (1964) An efficient method for finding the minimum of a function of several variables without calculating derivatives. Comput J 7(2):155–162
Krizhevsky A (2009) Learning multiple layers of features from tiny images. Master’s thesis, University of Toronto
Stallkamp J, Schlipsing M, Salmen J, Igel C (2012) Man vs. computer: benchmarking machine learning algorithms for traffic sign recognition. Neural Netw 32:323–332
Zheng Y, Zhao Y, Ren M, Yan H, Lu X, Liu J, Li J (2020) Cartoon face recognition: a benchmark dataset. In: Proceedings of ACM international conference on multimedia (MM). ACM, pp 2264–2272
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Zagoruyko S, Komodakis N (2016) Wide residual networks. In: Proceedings of the British Machine Vision Conference (BMVC)
Carlini N, Wagner D (2017) Adversarial examples are not easily detected: bypassing ten detection methods. In: Proceedings of the ACM workshop on Artificial Intelligence and Security (AISec). ACM, pp 3–14
Tramer F, Carlini N, Brendel W, Madry A (2020) On adaptive attacks to adversarial example defenses. In: Proceedings of annual conference on Neural Information Processing Systems (NeurIPS). Curran Associates, Inc., pp 1633–1645
ART (2018) Adversarial Robustness Toolbox (ART). https://github.com/Trusted-AI/adversarial-robustness-toolbox
Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531
Yu F, Seff A, Zhang Y, Song S, Funkhouser T, Xiao J (2015) LSUN: construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: Proceedings of International Conference on Learning Representations (ICLR)