Test flakiness forms a major testing concern. Flaky tests manifest non-deterministic outcomes that cripple continuous integration and lead developers to investigate false alerts. Industrial reports indicate that, at large scale, the accrual of flaky tests breaks the trust in test suites and entails significant computational cost. To alleviate this, practitioners are constrained to identify flaky tests and investigate their impact. To shed light on such mitigation mechanisms, we interview 14 practitioners with the aim of identifying (i) the sources of flakiness within the testing ecosystem, (ii) the impacts of flakiness, (iii) the measures adopted by practitioners when addressing flakiness, and (iv) the automation opportunities for these measures. Our analysis shows that, besides the tests and code, flakiness stems from interactions between the system components, the testing infrastructure, and external factors. We also highlight the impact of flakiness on testing practices and product quality and show that the adoption of guidelines and a stable infrastructure are key measures in mitigating the problem.
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SerVal - Security, Reasoning & Validation
Disciplines :
Computer science
Author, co-author :
HABCHI, Sarra ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
HABEN, Guillaume ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
PAPADAKIS, Mike ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Computer Science (DCS)
CORDY, Maxime ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
LE TRAON, Yves ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
External co-authors :
no
Language :
English
Title :
A Qualitative Study on the Sources, Impacts, and Mitigation Strategies of Flaky Tests
Publication date :
April 2022
Event name :
15th International Conference on Software Testing, Verification and Validation
Event date :
from 04-04-2022 to 08-04-2022
Main work title :
A Qualitative Study on the Sources, Impacts, and Mitigation Strategies of Flaky Tests