Abstract:
Industrial reports indicate that flaky tests are one of the primary concerns of software testing, mainly because of the false signals they provide. To deal with this issue, researchers have developed tools and techniques aiming at (automatically) identifying flaky tests, with encouraging results. However, to reach industrial adoption and practice, these techniques need to be replicated and evaluated extensively on multiple datasets, occasions, and settings. In view of this, we perform a replication study of a recently proposed method that predicts flaky tests based on their vocabulary. We replicate the original study along three different dimensions. First, we replicate the approach on the same subjects as in the original study but with a different evaluation methodology, i.e., we adopt a time-sensitive selection of training and test sets to better reflect the envisioned use case. Second, we consolidate the findings of the initial study by building a new dataset of 837 flaky tests from 9 projects in a different programming language, i.e., Python (the original study was in Java), which supports the generalisability of the results. Third, we propose an extension to the original approach by experimenting with different features extracted from the Code Under Test. Our results demonstrate that a more robust validation has a consistent negative impact on the reported results of the original study, but, fortunately, it does not invalidate the key conclusions of the study. We also find the reassuring result that vocabulary-based models can be used to predict test flakiness in Python, and that the information lying in the Code Under Test has a limited impact on the performance of the vocabulary-based models.
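For context, a vocabulary-based approach of the kind discussed above typically treats each test's source code as a bag of tokens and trains a standard classifier on it, while a time-sensitive evaluation splits samples chronologically rather than randomly so that the model is never trained on tests newer than those it predicts. The following is a minimal illustrative sketch of that idea in Python using scikit-learn; the field names (source, is_flaky, commit_date) and the specific vectorizer and classifier choices are assumptions made for illustration, not the exact pipeline of the original or replicated study.

# Minimal sketch: vocabulary-based flaky-test classification with a
# time-sensitive train/test split (illustrative assumptions only).
from dataclasses import dataclass
from datetime import datetime
from typing import List

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score


@dataclass
class TestCase:
    source: str            # body of the test method/function
    is_flaky: bool         # ground-truth flakiness label
    commit_date: datetime  # when this version of the test was committed


def evaluate(tests: List[TestCase], train_fraction: float = 0.8) -> float:
    # Time-sensitive split: train on older tests, evaluate on newer ones,
    # instead of a random split that can leak "future" information.
    tests = sorted(tests, key=lambda t: t.commit_date)
    cut = int(len(tests) * train_fraction)
    train, test = tests[:cut], tests[cut:]

    # Bag-of-words over identifiers and keywords in the test code
    # (the "vocabulary" of the test).
    vectorizer = CountVectorizer(token_pattern=r"[A-Za-z_][A-Za-z0-9_]*")
    X_train = vectorizer.fit_transform(t.source for t in train)
    X_test = vectorizer.transform(t.source for t in test)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, [t.is_flaky for t in train])
    predictions = clf.predict(X_test)
    return f1_score([t.is_flaky for t in test], predictions)

The chronological cut is the key difference from the original evaluation: with a random split, tokens from later revisions of a project can inform predictions about earlier tests, which inflates the reported scores relative to the envisioned use case.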