Active learning; Adversarial robustness; Deep learning testing; Image classification; Data selection; Density-based; Learning models; Performance; Software developer; State of the art; Software; Artificial intelligence
Abstract :
[en] Active learning helps software developers reduce the labeling cost when building high-quality machine learning models. A core component of active learning is the acquisition function, which determines which data should be selected for annotation. State-of-the-art (SOTA) acquisition functions focus on clean performance (e.g., accuracy) but disregard robustness, an important quality property, leading to fragile models with negligible robustness (less than 0.20%). In this paper, we first propose to integrate adversarial training into active learning (adversarial-robust active learning, ARAL) to produce robust models. Our empirical study on 11 acquisition functions and 15,105 trained deep neural networks (DNNs) shows that ARAL can produce models with robustness ranging from 2.35% to 63.85%. However, our study also reveals that acquisition functions that perform well on accuracy are worse than random sampling when it comes to robustness. By examining the reasons behind this, we devise density-based robust sampling with entropy (DRE), which targets both clean performance and robustness. The core idea of DRE is to maintain a balance between the selected data and the entire set based on the entropy density distribution. DRE outperforms SOTA functions in terms of robustness by up to 24.40% while remaining competitive on accuracy. Additionally, an in-depth evaluation shows that DRE is applicable as a test selection metric for model retraining, where it outperforms all compared functions by up to 8.21% in robustness.
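The abstract only outlines DRE's mechanism, so the following is a minimal illustrative sketch of entropy-density-matched selection, assuming softmax outputs over an unlabeled pool. The function names (`predictive_entropy`, `dre_select`) and the histogram-based density estimate are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: select unlabeled samples so that their entropy
# distribution mirrors that of the whole pool, in the spirit of DRE.
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy of each softmax prediction (probs has shape [N, C])."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def dre_select(probs, budget, n_bins=10, seed=0):
    """Pick `budget` pool indices whose entropy density matches the pool's.

    Entropy values are binned, and each bin receives a share of the budget
    proportional to its mass, so the selected subset keeps roughly the same
    entropy distribution as the entire unlabeled set.
    """
    rng = np.random.default_rng(seed)
    ent = predictive_entropy(probs)
    edges = np.histogram_bin_edges(ent, bins=n_bins)
    bin_of = np.clip(np.digitize(ent, edges) - 1, 0, n_bins - 1)
    mass = np.bincount(bin_of, minlength=n_bins) / len(ent)
    quota = np.floor(budget * mass).astype(int)

    selected = []
    for b in range(n_bins):
        idx = np.flatnonzero(bin_of == b)
        take = min(int(quota[b]), len(idx))
        if take:
            selected.extend(rng.choice(idx, size=take, replace=False).tolist())
    # Flooring can leave a small remainder; fill it with the highest-entropy
    # samples not yet selected.
    if len(selected) < budget:
        leftovers = np.setdiff1d(np.arange(len(ent)), selected)
        top = leftovers[np.argsort(-ent[leftovers])][: budget - len(selected)]
        selected.extend(top.tolist())
    return np.asarray(selected[:budget])
```

In the ARAL setting described in the abstract, the samples selected this way would then be labeled and used for adversarial training (e.g., PGD-style min-max training) rather than standard training, which is what yields the reported robustness gains.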
Disciplines :
Computer science
Author, co-author :
GUO, Yuejun ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal > Team Yves LE TRAON
HU, Qiang ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
PAPADAKIS, Michail ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
External co-authors :
no
Language :
English
Title :
DRE: density-based data selection with entropy for adversarial-robust deep learning models
Publication date :
February 2023
Journal title :
Neural Computing and Applications
ISSN :
0941-0643
eISSN :
1433-3058
Publisher :
Springer Science and Business Media Deutschland GmbH