[en] Context: When software evolves, opportunities for introducing faults appear. Therefore, it is important to test the evolved program behaviors during each evolution cycle. However, as software evolves, its complexity also grows, introducing challenges to the testing process. To deal with this issue, testing techniques should be adapted to target the effects of the program changes rather than the entire program functionality. To this end, commit-aware mutation testing, a powerful testing technique, has been proposed. Unfortunately, commit-aware mutation testing is challenging due to the complex program semantics involved. Hence, it is pertinent to understand the characteristics, predictability, and potential of the technique.
Objective: We conduct an exploratory study to investigate the properties of commit-relevant mutants, i.e., the test elements of commit-aware mutation testing, by proposing a general definition and an experimental approach to identify them. We thus aim to investigate the prevalence, location, and comparative advantages of commit-aware mutation testing over time (i.e., across program evolution). We also investigate the predictive power of several commit-related features for identifying and selecting commit-relevant mutants, in order to understand the properties essential for a best-effort application of the technique.
Method: Our definition of commit relevance relies on the notion of observational slicing, approximated by higher-order mutation. Specifically, our approach utilizes the impact of mutants, i.e., the effect of one mutant on another, to capture and analyze the implicit interactions between the changed and unchanged code parts. The study analyses over 10 million mutants, 288 commits, and five (5) different open-source software projects, involves over 68,213 CPU days of computation, and sets a ground truth on which we perform our analysis.
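To make the interaction-based identification concrete, the following minimal Java sketch illustrates the core intuition only; it is our illustration, not the study's tooling, and the names CommitRelevanceSketch, TestOutcomes, and isCommitRelevant are hypothetical. Assuming each mutant's behavior is summarized by its per-test pass/fail outcomes, a mutant outside the commit is flagged as commit-relevant when pairing it with some mutant inside the commit alters the observable outcomes of that within-change mutant, signaling an interaction between changed and unchanged code (the study's actual definition is more nuanced).

import java.util.Map;
import java.util.Set;

public class CommitRelevanceSketch {

    // Hypothetical container: test name -> pass/fail verdict observed for one mutant.
    record TestOutcomes(Map<String, Boolean> verdictPerTest) {}

    // A mutant M outside the commit is deemed commit-relevant if some mutant X
    // inside the commit behaves differently when M is also applied, i.e., the
    // higher-order mutant (M, X) and the first-order mutant X disagree on at
    // least one test, revealing an interaction with the changed code.
    static boolean isCommitRelevant(String mutantM,
                                    Set<String> withinChangeMutants,
                                    Map<String, TestOutcomes> firstOrder,
                                    Map<String, Map<String, TestOutcomes>> secondOrder) {
        Map<String, TestOutcomes> pairedWithM = secondOrder.getOrDefault(mutantM, Map.of());
        for (String x : withinChangeMutants) {
            TestOutcomes xAlone = firstOrder.get(x);   // outcomes of X alone
            TestOutcomes xWithM = pairedWithM.get(x);  // outcomes of the pair (M, X)
            if (xAlone != null && xWithM != null
                    && !xAlone.verdictPerTest().equals(xWithM.verdictPerTest())) {
                return true;
            }
        }
        return false;
    }
}

In practice, such outcome matrices would be obtained by running the test suite against each first-order and higher-order mutant, which is what makes the ground-truth computation reported above so expensive.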
Results: Our analysis shows that commit-relevant mutants are mainly located outside of the commit's changed code (81%), suggesting a limitation in previous work. We also note that effective selection of commit-relevant mutants can reduce the number of mutants by up to 93%. In addition, we demonstrate that commit-relevant mutation testing is significantly more effective and efficient than state-of-the-art baselines, i.e., random mutant selection and analysis of only the mutants within the program change. In our analysis of the predictive power of mutant- and commit-related features (e.g., number of mutants within a change, mutant type, and commit size) for predicting commit-relevant mutants, we found that most proxy features do not reliably predict commit-relevant mutants.
Conclusion: This empirical study highlights the properties of commit-relevant mutants and demonstrates the importance of identifying and selecting commit-relevant mutants when testing evolving software systems.
Disciplines:
Computer science
Author, co-author:
OJDANIC, Milos ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
SOREMEKUN, Ezekiel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
DEGIOVANNI, Renzo Gaston ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
PAPADAKIS, Mike ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
External co-authors:
No
Document language:
English
Title:
Mutation Testing in Evolving Systems: Studying the relevance of mutants to code evolution
Publication/release date:
11 May 2022
Journal title:
ACM Transactions on Software Engineering and Methodology
ISSN:
1049-331X
Publisher:
Association for Computing Machinery (ACM), United States