References of "Titcheu Chekam, Thierry 50020742"
     in
Bookmark and Share    
Full Text
Peer Reviewed
See detailSelecting Fault Revealing Mutants
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Bissyande, Tegawendé François D Assise UL et al

in Empirical Software Engineering (in press)

Detailed reference viewed: 281 (35 UL)
Full Text
Peer Reviewed
See detailCerebro: Static Subsuming Mutant Selection
Garg, Aayush UL; Ojdanic, Milos UL; Degiovanni, Renzo Gaston UL et al

in IEEE Transactions on Software Engineering (2022)

Detailed reference viewed: 142 (36 UL)
Full Text
Peer Reviewed
See detailKilling Stubborn Mutants with Symbolic Execution
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Cordy, Maxime UL et al

in ACM Transactions on Software Engineering and Methodology (2021), 30(2), 191--1923

Detailed reference viewed: 302 (16 UL)
Full Text
Peer Reviewed
See detailCommit-Aware Mutation Testing
Ma, Wei UL; Laurent, Thomas; Ojdanic, Milos UL et al

in IEEE International Conference on Software Maintenance and Evolution (ICSME) (2020)

Detailed reference viewed: 250 (64 UL)
Full Text
Peer Reviewed
See detailMuteria: An Extensible and Flexible Multi-Criteria Software Testing Framework
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Le Traon, Yves UL

in ACM/IEEE International Conference on Automation of Software Test (AST) 2020 (2020)

Program based test adequacy criteria (TAC), such as statement, branch coverage and mutation give objectives for software testing. Many techniques and tools have been developed to improve each phase of the ... [more ▼]

Program based test adequacy criteria (TAC), such as statement, branch coverage and mutation give objectives for software testing. Many techniques and tools have been developed to improve each phase of the TAC-based software testing process. Nonetheless, The engineering effort required to integrate these tools and techniques into the software testing process limits their use and creates an overhead to the users. Especially for system testing with languages like C, where test cases are not always well structured in a framework. In response to these challenges, this paper presents Muteria, a TAC-based software testing framework. Muteria enables the integration of multiple software testing tools. Muteria abstracts each phase of the TAC-based software testing process to provide tool drivers interfaces for the implementation of tool drivers. Tool drivers enable Muteria to call the corresponding tools during the testing process. An initial set of drivers for KLEE, Shadow and SEMu test-generation tools, Gcov, and coverage.py code coverage tools, and Mart mutant generation tool for C and Python programming language were implemented with an average of 345 lines of Python code. Moreover, the user configuration file required to measure code coverage and mutation score on a sample C programs, using the Muteria framework, consists of less than 15 configuration variables. Users of the Muteria framework select, in a configuration file, the tools and TACs to measure. The Muteria framework uses the user configuration to run the testing process and report the outcome. Users interact with Muteria through its Application Programming Interface and Command Line Interface. Muteria can benefit to researchers as a laboratory to execute experiments, and to software practitioners. [less ▲]

Detailed reference viewed: 202 (8 UL)
Full Text
Peer Reviewed
See detailSelecting fault revealing mutants
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Bissyande, Tegawendé François D Assise UL et al

in Empirical Software Engineering (2020)

Detailed reference viewed: 174 (17 UL)
Full Text
Peer Reviewed
See detailSelecting fault revealing mutants
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Bissyande, Tegawendé François D Assise UL et al

in Empirical Software Engineering (2019)

Mutant selection refers to the problem of choosing, among a large number of mutants, the (few) ones that should be used by the testers. In view of this, we investigate the problem of selecting the fault ... [more ▼]

Mutant selection refers to the problem of choosing, among a large number of mutants, the (few) ones that should be used by the testers. In view of this, we investigate the problem of selecting the fault revealing mutants, i.e., the mutants that are killable and lead to test cases that uncover unknown program faults. We formulate two variants of this problem: the fault revealing mutant selection and the fault revealing mutant prioritization. We argue and show that these problems can be tackled through a set of ‘static’ program features and propose a machine learning approach, named FaRM, that learns to select and rank killable and fault revealing mutants. Experimental results involving 1,692 real faults show the practical benefits of our approach in both examined problems. Our results show that FaRM achieves a good trade-off between application cost and effectiveness (measured in terms of faults revealed). We also show that FaRM outperforms all the existing mutant selection methods, i.e., the random mutant sampling, the selective mutation and defect prediction (mutating the code areas pointed by defect prediction). In particular, our results show that with respect to mutant selection, our approach reveals 23% to 34% more faults than any of the baseline methods, while, with respect to mutant prioritization, it achieves higher average percentage of revealed faults with a median difference between 4% and 9% (from the random mutant orderings). [less ▲]

Detailed reference viewed: 107 (7 UL)
Full Text
See detailAssessment and Improvement of the Practical Use of Mutation for Automated Software Testing
Titcheu Chekam, Thierry UL

Doctoral thesis (2019)

Software testing is the main quality assurance technique used in software engineering. In fact, companies that develop software and open-source communities alike actively integrate testing into their ... [more ▼]

Software testing is the main quality assurance technique used in software engineering. In fact, companies that develop software and open-source communities alike actively integrate testing into their software development life cycle. In order to guide and give objectives for the software testing process, researchers have designed test adequacy criteria (TAC) which, define the properties of a software that must be covered in order to constitute a thorough test suite. Many TACs have been designed in the literature, among which, the widely used statement and branch TAC, as well as the fault-based TAC named mutation. It has been shown in the literature that mutation is effective at revealing fault in software, nevertheless, mutation adoption in practice is still lagging due to its cost. Ideally, TACs that are most likely to lead to higher fault revelation are desired for testing and, the fault-revelation of test suites is expected to increase as their coverage of TACs test objectives increase. However, the question of which TAC best guides software testing towards fault revelation remains controversial and open, and, the relationship between TACs test objectives’ coverage and fault-revelation remains unknown. In order to increase knowledge and provide answers about these issues, we conducted, in this dissertation, an empirical study that evaluates the relationship between test objectives’ coverage and fault-revelation for four TACs (statement, branch coverage and, weak and strong mutation). The study showed that fault-revelation increase with coverage only beyond some coverage threshold and, strong mutation TAC has highest fault revelation. Despite the benefit of higher fault-revelation that strong mutation TAC provide for software testing, software practitioners are still reluctant to integrate strong mutation into their software testing activities. This happens mainly because of the high cost of mutation analysis, which is related to the large number of mutants and the limitation in the automation of test generation for strong mutation. Several approaches have been proposed, in the literature, to tackle the analysis’ cost issue of strong mutation. Mutant selection (reduction) approaches aim to reduce the number of mutants used for testing by selecting a small subset of mutation operator to apply during mutants generation, thus, reducing the number of analyzed mutants. Nevertheless, those approaches are not more effective, w.r.t. fault-revelation, than random mutant sampling (which leads to a high loss in fault revelation). Moreover, there is not much work in the literature that regards cost-effective automated test generation for strong mutation. This dissertation proposes two techniques, FaRM and SEMu, to reduce the cost of mutation testing. FaRM statically selects and prioritizes mutants that lead to faults (fault-revealing mutants), in order to reduce the number of mutants (fault-revealing mutants represent a very small proportion of the generated mutants). SEMu automatically generates tests that strongly kill mutants and thus, increase the mutation score and improve the test suites. First, this dissertation makes an empirical study that evaluates the fault-revelation (ability to lead to tests that have high fault-revelation) of four TACs, namely statement, branch, weak mutation and strong mutation. The outcome of the study show evidence that for all four studied TACs, the fault-revelation increases with TAC test objectives’ coverage only beyond a certain threshold of coverage. This suggests the need to attain higher coverage during testing. Moreover, the study shows that strong mutation is the only studied TAC that leads to tests that have, significantly, the highest fault-revelation. Second, in line with mutant reduction, we study the different mutant quality indicators (used to qualify "useful" mutants) proposed in the literature, including fault-revealing mutants. Our study shows that there is a large disagreement between the indicators suggesting that the fault-revealing mutant set is unique and differs from other mutant sets. Thus, given that testing aims to reveal faults, one should directly target fault-revealing mutants for mutant reduction. We also do so in this dissertation. Third, this dissertation proposes FaRM, a mutant reduction technique based on supervised machine learning. In order to automatically discriminate, before test execution, between useful (valuable) and useless mutants, FaRM build a mutants classification machine learning model. The features for the classification model are static program features of mutants categorized as mutant types and mutant context (abstract syntax tree, control flow graph and data/control dependency information). FaRM’s classification model successfully predicted fault-revealing mutants and killable mutants. Then, in order to reduce the number of analyzed mutants, FaRM selects and prioritizes fault-revealing mutants based of the aforementioned mutants classification model. An empirical evaluation shows that FaRM outperforms (w.r.t. the accuracy of fault-revealing mutant selection) random mutants sampling and existing mutation operators-based mutant selection techniques. Fourth, this dissertation proposes SEMu, an automated test input generation technique aiming to increase strong mutation coverage score of test suites. SEMu is based on symbolic execution and leverages multiple cost reduction heuristics for the symbolic execution. An empirical evaluation shows that, for limited time budget, the SEMu generates tests that successfully increase strong mutation coverage score and, kill more mutants than test generated by state-of-the-art techniques. Finally, this dissertation proposes Muteria a framework that enables the integration of FaRM and SEMu into the automated software testing process. Overall, this dissertation provides insights on how to effectively use TACs to test software, shows that strong mutation is the most effective TAC for software testing. It also provides techniques that effectively facilitate the practical use of strong mutation and, an extensive tooling to support the proposed techniques while enabling their extensions for the practical adoption of strong mutation in software testing. [less ▲]

Detailed reference viewed: 327 (36 UL)
Full Text
Peer Reviewed
See detailMart: A Mutant Generation Tool for LLVM
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Le Traon, Yves UL

in Titcheu Chekam, Thierry; Papadakis, Mike; Le Traon, Yves (Eds.) Mart: A Mutant Generation Tool for LLVM (2019)

Program mutation makes small syntactic alterations to programs' code in order to artificially create faulty programs (mutants). Mutants are used, in software analysis, to evaluate and improve test suites ... [more ▼]

Program mutation makes small syntactic alterations to programs' code in order to artificially create faulty programs (mutants). Mutants are used, in software analysis, to evaluate and improve test suites. Mutants creation (generation) tools are often characterized by their mutation operators and the way they create and represent the mutants. This paper presents Mart, a mutants generation tool, for LLVM bitcode, that supports the fine-grained definition of mutation operators (as matching rule - replacing pattern pair; uses 816 defined pairs by default) and the restriction of the code parts to mutate. New operators are implemented in Mart by implementing their matching rules and replacing patterns. Mart also implements in-memory Trivial Compiler Equivalence to eliminate equivalent and duplicate mutants during mutants generation. Mart generates mutant code as separated mutant files, meta-mutants file, weak mutation, and mutant coverage instrumented files. The generated LLVM bitcode files can be interpreted using an LLVM interpreter or compiled into native code. Mart is publicly available (https://github.com/thierry-tct/mart) for use by researchers and practitioners. Mart has been applied to generate mutants for several research experiments and generated more than 4,000,000 mutants. [less ▲]

Detailed reference viewed: 307 (21 UL)
Full Text
Peer Reviewed
See detailAre mutants really natural? A study on how “naturalness” helps mutant selection
Jimenez, Matthieu UL; Titcheu Chekam, Thierry UL; Cordy, Maxime UL et al

in Proceedings of 12th International Symposium on 
 Empirical Software Engineering and Measurement (ESEM'18) (2018, October 11)

Background: Code is repetitive and predictable in a way that is similar to the natural language. This means that code is ``natural'' and this ``naturalness'' can be captured by natural language modelling ... [more ▼]

Background: Code is repetitive and predictable in a way that is similar to the natural language. This means that code is ``natural'' and this ``naturalness'' can be captured by natural language modelling techniques. Such models promise to capture the program semantics and identify source code parts that `smell', i.e., they are strange, badly written and are generally error-prone (likely to be defective). Aims: We investigate the use of natural language modelling techniques in mutation testing (a testing technique that uses artificial faults). We thus, seek to identify how well artificial faults simulate real ones and ultimately understand how natural the artificial faults can be. %We investigate this question in a fault revelation perspective. Our intuition is that natural mutants, i.e., mutants that are predictable (follow the implicit coding norms of developers), are semantically useful and generally valuable (to testers). We also expect that mutants located on unnatural code locations (which are generally linked with error-proneness) to be of higher value than those located on natural code locations. Method: Based on this idea, we propose mutant selection strategies that rank mutants according to a) their naturalness (naturalness of the mutated code), b) the naturalness of their locations (naturalness of the original program statements) and c) their impact on the naturalness of the code that they apply to (naturalness differences between original and mutated statements). We empirically evaluate these issues on a benchmark set of 5 open-source projects, involving more than 100k mutants and 230 real faults. Based on the fault set we estimate the utility (i.e. capability to reveal faults) of mutants selected on the basis of their naturalness, and compare it against the utility of randomly selected mutants. Results: Our analysis shows that there is no link between naturalness and the fault revelation utility of mutants. We also demonstrate that the naturalness-based mutant selection performs similar (slightly worse) to the random mutant selection. Conclusions: Our findings are negative but we consider them interesting as they confute a strong intuition, i.e., fault revelation is independent of the mutants' naturalness. [less ▲]

Detailed reference viewed: 201 (23 UL)
Full Text
Peer Reviewed
See detailOn the Synchronization Bottleneck of OpenStack Swift-like Cloud Storage Systems
Ruan, Mingkang; Titcheu Chekam, Thierry UL; Zhai, Ennan et al

in IEEE Transactions on Parallel and Distributed Systems (2018), PP(99), 1-1

As one type of the most popular cloud storage services, OpenStack Swift and its follow-up systems replicate each object across multiple storage nodes and leverage object sync protocols to achieve high ... [more ▼]

As one type of the most popular cloud storage services, OpenStack Swift and its follow-up systems replicate each object across multiple storage nodes and leverage object sync protocols to achieve high reliability and eventual consistency. The performance of object sync protocols heavily relies on two key parameters: r (number of replicas for each object) and n (number of objects hosted by each storage node). In existing tutorials and demos, the configurations are usually r = 3 and n < 1000 by default, and the sync process seems to perform well. However, we discover in data-intensive scenarios, e.g., when r > 3 and n >> 1000, the sync process is significantly delayed and produces massive network overhead, referred to as the sync bottleneck problem. By reviewing the source code of OpenStack Swift, we find that its object sync protocol utilizes a fairly simple and network-intensive approach to check the consistency among replicas of objects. Hence in a sync round, the number of exchanged hash values per node is Theta(n x r). To tackle the problem, we propose a lightweight and practical object sync protocol, LightSync, which not only remarkably reduces the sync overhead, but also preserves high reliability and eventual consistency. LightSync derives this capability from three novel building blocks: 1) Hashing of Hashes, which aggregates all the h hash values of each data partition into a single but representative hash value with the Merkle tree; 2) Circular Hash Checking, which checks the consistency of different partition replicas by only sending the aggregated hash value to the clockwise neighbor; and 3) Failed Neighbor Handling, which properly detects and handles node failures with moderate overhead to effectively strengthen the robustness of LightSync. The design of LightSync offers provable guarantee on reducing the per-node network overhead from Theta(n x r) to Theta(n/h). Furthermore, we have implemented LightSync as an open-source patch and adopted it to OpenStack Swift, thus reducing the sync delay by up to 879x and the network overhead by up to 47.5x. [less ▲]

Detailed reference viewed: 170 (7 UL)
Full Text
Peer Reviewed
See detailPredicting the Fault Revelation Utility of Mutants
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Bissyande, Tegawendé François D Assise UL et al

in 40th International Conference on Software Engineering, Gothenburg, Sweden, May 27 - 3 June 2018 (2018)

Detailed reference viewed: 297 (22 UL)
Full Text
Peer Reviewed
See detailMutant Quality Indicators
Papadakis, Mike UL; Titcheu Chekam, Thierry UL; Le Traon, Yves UL

in 13th International Workshop on Mutation Analysis (MUTATION'18) (2018)

Detailed reference viewed: 306 (20 UL)
Full Text
Peer Reviewed
See detailAn Empirical Study on Mutation, Statement and Branch Coverage Fault Revelation that Avoids the Unreliable Clean Program Assumption
Titcheu Chekam, Thierry UL; Papadakis, Mike UL; Le Traon, Yves UL et al

in International Conference on Software Engineering (ICSE 2017) (2017, May 28)

Many studies suggest using coverage concepts, such as branch coverage, as the starting point of testing, while others as the most prominent test quality indicator. Yet the relationship between coverage ... [more ▼]

Many studies suggest using coverage concepts, such as branch coverage, as the starting point of testing, while others as the most prominent test quality indicator. Yet the relationship between coverage and fault-revelation remains unknown, yielding uncertainty and controversy. Most previous studies rely on the Clean Program Assumption, that a test suite will obtain similar coverage for both faulty and fixed (‘clean’) program versions. This assumption may appear intuitive, especially for bugs that denote small semantic deviations. However, we present evidence that the Clean Program Assumption does not always hold, thereby raising a critical threat to the validity of previous results. We then conducted a study using a robust experimental methodology that avoids this threat to validity, from which our primary finding is that strong mutation testing has the highest fault revelation of four widely-used criteria. Our findings also revealed that fault revelation starts to increase significantly only once relatively high levels of coverage are attained. [less ▲]

Detailed reference viewed: 454 (44 UL)
Full Text
See detailOn the Naturalness of Mutants
Jimenez, Matthieu UL; Cordy, Maxime UL; Kintis, Marinos UL et al

E-print/Working paper (2017)

Detailed reference viewed: 197 (15 UL)
Full Text
Peer Reviewed
See detailOn the Synchronization Bottleneck of OpenStack Swift-like Cloud Storage Systems
Titcheu Chekam, Thierry UL; Ennan, Zhai; Zhenhua, Li et al

in IEEE International Conference on Computer Communications, San Francisco, CA 10-15 April 2016 (2016, April)

As one type of the most popular cloud storage services, OpenStack Swift and its follow-up systems replicate each data object across multiple storage nodes and leverage object sync protocols to achieve ... [more ▼]

As one type of the most popular cloud storage services, OpenStack Swift and its follow-up systems replicate each data object across multiple storage nodes and leverage object sync protocols to achieve high availability and eventual consistency. The performance of object sync protocols heavily relies on two key parameters: r (number of replicas for each object) and n (number of objects hosted by each storage node). In existing tutorials and demos, the configurations are usually r = 3 and n < 1000 by default, and the object sync process seems to perform well. To deep understand object sync protocols, we first make a lab-scale OpenStack Swift deployment and run experiments with various configurations. We discover that in data-intensive scenarios, e.g., when r > 3 and n >> 1000, the object sync process is significantly delayed and produces massive network overhead. This phenomenon is referred to as the sync bottleneck problem. Then, to explore the root cause, we review the source code of OpenStack Swift and find that its object sync protocol utilizes a fairly simple and network-intensive approach to check the consistency among replicas of objects. In particular, each storage node is required to periodically multicast the hash values of all its hosted objects to all the other replica nodes. Thus in a sync round, the number of exchanged hash values per node is Theta (n* r). Further, to tackle the problem, we propose a lightweight object sync protocol called LightSync. It remarkably reduces the sync overhead by using two novel building blocks: 1) Hashing of Hashes, which aggregates all the h hash values of each data partition into a single but representative hash value with the Merkle tree; 2) Circular Hash Checking, which checks the consistency of different partition replicas by only sending the aggregated hash value to the clockwise neighbor. Its design provably reduces the per-node network overhead from Theta(n* r) to Theta (n/h ). In addition, we have implemented LightSync as an opensource patch and adopted it to OpenStack Swift, thus reducing sync delay by up to 28.8X and network overhead by up to 14.2X . [less ▲]

Detailed reference viewed: 463 (27 UL)