Mutation Testing; Pre-Trained Language Models; CodeBERT
Abstract:
We introduce µBert, a mutation testing tool that uses a pre-trained language model (CodeBERT) to generate mutants. µBert masks a token in the expression given as input and uses CodeBERT to predict it; the mutants are then generated by replacing the masked token with the predicted ones. We evaluate µBert on 40 real faults from Defects4J and show that it detects 27 of the 40 faults, while the baseline (PiTest) detects 26 of them. We also show that µBert can be two times more cost-effective than PiTest when the same number of mutants is analysed.
Additionally, we evaluate the impact of µBert's mutants when used by program assertion inference techniques, and show that they can help produce better specifications. Finally, we discuss the quality and naturalness of some interesting mutants produced by µBert during our experimental evaluation.
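The mask-and-predict step described in the abstract can be sketched as follows. This is a minimal illustration only, not µBert's implementation: `tokenize`, `generate_mutants` and `toy_predictor` are hypothetical names, and `toy_predictor` is a hand-written stand-in for CodeBERT's masked-token prediction so that the sketch runs without any model.

```python
import re
from typing import Callable, List

MASK = "<mask>"  # placeholder for the masked token (assumed spelling)


def tokenize(expression: str) -> List[str]:
    """Very rough tokenizer: identifiers, integers, a few two-character operators, then any single symbol."""
    return re.findall(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||\S", expression)


def generate_mutants(expression: str,
                     predict: Callable[[List[str]], List[str]]) -> List[str]:
    """Mask each token in turn, ask `predict` for replacements, and build one mutant per new prediction."""
    tokens = tokenize(expression)
    mutants: List[str] = []
    for i, original in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        for candidate in predict(masked):
            if candidate == original:
                continue  # predicting the original token yields no mutant
            mutant = " ".join(tokens[:i] + [candidate] + tokens[i + 1:])
            if mutant not in mutants:
                mutants.append(mutant)
    return mutants


def toy_predictor(masked_tokens: List[str]) -> List[str]:
    """Hypothetical stand-in for the language model: proposes relational
    operators when the mask sits between two operands, nothing otherwise."""
    i = masked_tokens.index(MASK)
    left = masked_tokens[i - 1] if i > 0 else ""
    right = masked_tokens[i + 1] if i + 1 < len(masked_tokens) else ""
    is_operand = lambda tok: re.fullmatch(r"\w+", tok) is not None
    if is_operand(left) and is_operand(right):
        return ["<", "<=", ">"]
    return []


print(generate_mutants("a < b", toy_predictor))
```

In a real setting, `predict` would wrap a masked language model such as CodeBERT and return its top-ranked tokens for the masked position; the mutants would then be compiled and run against the test suite, which this sketch does not attempt.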
Research centre:
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Security Design and Validation Research Group (SerVal)
Disciplines:
Computer science
Author, co-author:
DEGIOVANNI, Renzo Gaston ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SerVal
PAPADAKIS, Mike ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
External co-authors:
no
Document language:
English
Title:
µBert: Mutation Testing using Pre-Trained Language Models
Publication/release date:
2022
Event name:
15th IEEE International Conference on Software Testing, Verification and Validation Workshops (ICST Workshops 2022)
Event date:
April 4-13, 2022
Event scope:
International
Main work title:
µBert: Mutation Testing using Pre-Trained Language Models