Paper published in a book (Colloquia, congresses, scientific conferences and proceedings)
On the Impact of Flaky Tests in Automated Program Repair
Qin, Yihao; Wang, Shangwen; LIU, Kui et al.
2021, in 28th IEEE International Conference on Software Analysis, Evolution and Reengineering, Hawaii 9-12 March 2021
Peer reviewed
 

Documents


Full text
2021_SANER_Flakytests4APR.pdf
Author postprint (414.31 kB)

All documents in ORBilu are protected by a user license.




Details



Keywords:
Program repair; Flaky tests; Empirical assessment
Abstract:
[en] The literature of Automated Program Repair is largely dominated by approaches that leverage test suites not only to expose bugs but also to validate the generated patches. Unfortunately, beyond the widely discussed concern that test suites are an imperfect oracle because they can be incomplete, they can also include tests that are flaky. A flaky test is one that a program can pass or fail in a non-deterministic way. Such tests are generally carefully removed from repair benchmarks. In practice, however, flaky tests are present in the test suites of software repositories. To the best of our knowledge, no study has discussed this threat to validity for the evaluation of program repair. In this work, we highlight this threat and further investigate the impact of flaky tests by reverting their removal from the Defects4J benchmark. Our study aims to characterize the impact of flaky tests on localizing bugs and the eventual influence on repair performance. Among other insights, we find that (1) although flaky tests account for only a small fraction (≈0.3%) of all tests, they affect experiments related to a large proportion (98.9%) of Defects4J real-world faults; (2) most flaky tests (98%) actually provide deterministic results under specific environment configurations (with the JDK version influencing the results); (3) flaky tests drastically hinder the effectiveness of spectrum-based fault localization (e.g., the rankings of 90 bugs drop, while no bug obtains better localization results than those achieved without flaky tests); and (4) the repairability of APR tools is greatly affected by the presence of flaky tests (e.g., 10 state-of-the-art APR tools can now fix significantly fewer bugs than when the benchmark is manually curated to remove flaky tests). Given that the detection of flaky tests is still nascent, we call for the program repair community to relax the artificial assumption that the test suite is free from flaky tests.
One direction that we propose is to develop strategies where patches that partially fix bugs are considered worthwhile: a patch may make the program pass some test cases but fail others (which may actually be the flaky ones).
Disciplines:
Computer science
Author, co-author:
Qin, Yihao; National University of Defense Technology, Changsha, China
Wang, Shangwen; National University of Defense Technology, Changsha, China
LIU, Kui; Nanjing University of Aeronautics and Astronautics, Nanjing, China
Mao, Xiaoguang; National University of Defense Technology, Changsha, China
BISSYANDE, Tegawendé François D Assise; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
External co-authors:
yes
Document language:
English
Title:
On the Impact of Flaky Tests in Automated Program Repair
Publication date:
10 March 2021
Event name:
28th IEEE International Conference on Software Analysis, Evolution and Reengineering
Event place:
Hawaii, United States
Event date:
9 to 12 March 2021
Event scope:
International
Main work title:
28th IEEE International Conference on Software Analysis, Evolution and Reengineering, Hawaii 9-12 March 2021
ISBN/EAN:
978-1-7281-9630-5
Pages:
295-306
Peer reviewed:
Peer reviewed
Focus Area:
Security, Reliability and Trust
Available on ORBilu:
since 08 February 2022

Statistics

Number of views
101 (of which 0 from Unilu)
Number of downloads
270 (of which 7 from Unilu)

Scopus® citations
15
Scopus® citations
without self-citations
9
OpenAlex citations
20
WoS citations
16
