[en] Applying mutation testing to subtle program changes, such as program patches or other small-scale code modifications, requires mutants that capture the delta of the altered behaviours. To address this issue, we introduce the concept of commit-relevant mutants, i.e., the mutants that interact with the behaviours of the system affected by a particular commit. Commit-aware mutation testing is therefore a test assessment metric tailored to a specific commit. By analysing 83 commits from 25 projects, involving 2,253,610 mutants in both C and Java, we identify the commit-relevant mutants and explore their relationship with other categories of mutants. Our results show that commit-relevant mutants form a small subset of all mutants that differs from the other classes of mutants (subsuming and hard-to-kill), and that the commit-relevant mutation score is only weakly correlated with the traditional mutation score (Kendall/Pearson 0.15-0.4). Moreover, commit-aware mutation analysis provides insights about the testing of a commit and can be more efficient than classical mutation analysis: in our experiments, for the same number of analysed mutants, commit-relevant mutants have better fault-revelation potential (30% higher chances of revealing commit-introduced faults) than traditional mutants. We also illustrate a possible application of commit-aware mutation testing as a metric to evaluate test case prioritisation.
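To make the metric concrete, the following minimal sketch (in Python; all names, data structures, and values are hypothetical illustrations, not taken from the paper's tooling) computes a commit-relevant mutation score as the fraction of commit-relevant mutants killed by a given test suite:

# Hypothetical sketch: compute a commit-relevant mutation score.
# 'kill_matrix' maps each mutant id to the set of tests that kill it;
# 'relevant_mutants' is the subset of mutant ids deemed relevant to the commit.

def commit_relevant_mutation_score(kill_matrix, relevant_mutants, test_suite):
    tests = set(test_suite)
    killed = {m for m in relevant_mutants
              if kill_matrix.get(m, set()) & tests}
    return len(killed) / len(relevant_mutants) if relevant_mutants else 1.0

# Example: 3 commit-relevant mutants, the suite kills 2 of them -> score ~0.67.
kill_matrix = {"m1": {"t1"}, "m2": {"t3"}, "m3": set(), "m4": {"t2"}}
relevant_mutants = {"m1", "m2", "m3"}
print(commit_relevant_mutation_score(kill_matrix, relevant_mutants, ["t1", "t2", "t3"]))

In contrast to the traditional mutation score, which would be computed over all generated mutants (here m1-m4), the score above only considers the mutants identified as interacting with the commit.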
Disciplines:
Computer science
Author, co-author:
OJDANIC, Milos ✱; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
Ma, Wei ✱
Laurent, Thomas
Titcheu Chekam, Thierry
Ventresque, Anthony
PAPADAKIS, Mike ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
✱ These authors contributed equally to this publication.