Deep neural networks (DNNs) have demonstrated superior performance over classical machine learning to support many features in safety-critical systems. Although DNNs are now widely used in such systems (e.g., self-driving cars), there has been limited progress on automated support for functional safety analysis in DNN-based systems. For example, identifying the root causes of errors, which would enable both risk analysis and DNN retraining, remains an open problem. In this paper, we propose SAFE, a black-box approach to automatically characterize the root causes of DNN errors. SAFE relies on a transfer learning model pre-trained on ImageNet to extract features from error-inducing images. It then applies a density-based clustering algorithm to detect arbitrarily shaped clusters of images that model plausible causes of error. Finally, the clusters are used to effectively retrain and improve the DNN. The black-box nature of SAFE is motivated by our objective of facilitating adoption: it requires neither changes to the DNN nor access to its internals.
Experimental results from case studies in the automotive domain show the superior ability of SAFE to identify different root causes of DNN errors. SAFE also yields significant improvements in DNN accuracy after retraining, while saving significant execution time and memory compared to alternatives.
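The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes that each error-inducing image has already been turned into a feature vector by a pre-trained model (the tiny 2-D vectors below are made-up stand-ins, not real features), and it uses DBSCAN as one well-known density-based algorithm; the abstract does not name the specific algorithm or parameters, so `eps`, `min_pts`, and all function names here are illustrative assumptions.

```python
import math
from collections import deque

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not a core point; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


# Hypothetical feature vectors of error-inducing images (assumption: in
# practice these would be high-dimensional outputs of a pre-trained model).
features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),   # one dense group of similar failures
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),   # a second dense group
    (10.0, 0.0),                          # an isolated failure
]
labels = dbscan(features, eps=0.5, min_pts=3)
print(labels)
```

Each resulting cluster groups images with similar features and can be inspected as one candidate root cause; images labelled as noise share no dense neighbourhood and are left unassigned.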
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Software Verification and Validation Lab (SVV Lab)
Disciplines :
Computer science
Author, co-author :
ATTAOUI, Mohammed Oualid ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV
FAHMY, Hazem ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV
PASTORE, Fabrizio ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV
BRIAND, Lionel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV
External co-authors :
yes
Language :
English
Title :
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction and Clustering
Publication date :
July 2022
Journal title :
ACM Transactions on Software Engineering and Methodology
ISSN :
1049-331X
Publisher :
Association for Computing Machinery (ACM), United States