![]() Tian, Haoye ![]() ![]() in ACM Transactions on Software Engineering and Methodology (2022) Detailed reference viewed: 14 (2 UL)![]() ; ; Koyuncu, Anil ![]() in Journal of Systems and Software (2021) Detailed reference viewed: 141 (6 UL)![]() Koyuncu, Anil ![]() Doctoral thesis (2020) Automated program repair (APR) attracts a huge interest from research and industry as the ultimate target in automation of software maintenance. Towards realizing this automation promise, the research ... [more ▼] Automated program repair (APR) attracts a huge interest from research and industry as the ultimate target in automation of software maintenance. Towards realizing this automation promise, the research community has explored various ideas and techniques, which are increasingly demonstrating that APR is no longer fictional. Although literature techniques constantly set new records in fixing a significant fraction of defects within well-established benchmarks, we are not aware of large-scale adoption of APR in practice. Meanwhile, open-source and commercial organizations have started to reflect on the potential of integrating some automated steps in the software development cycle. Actually, the current practice has several development settings that use a number of tools to automate and systematize various tasks such as code style checking, bug detection, and systematic patching. Our work is motivated by this fact. We advocate that systematic and empirical exploration of the current practice that leverage tools to automate debugging tasks would provide valuable insights for rethinking and boosting the APR agenda towards its acceptability by developer communities. We have identified three investigation axes in this dissertation. First, mining software repositories towards understanding code change properties that could be valuable to guide program repair. Second, analyzing communication channels in software development in order to assess to what extent they could be relevant in a real-world program repair scenario. Third, exploring generic concepts of patching in the literature for establishing a common foundation for program repair pipelines that can be integrated with industrial settings. This dissertation makes the following contributions to the community: • An empirical study of tool support in a real development setting providing concrete insights on the acceptance, stability and the nature of bugs being fixed by manually-craft patches vs tool-supported patches and manifests opportunities for improving automated repair techniques. • A novel information retrieval based bug localization approach that learns how to compute the similarity scores of various types of features. • An automated mining strategy to infer fix pattern that can be integrated to automated program repair pipelines. • A practical bug report driven program repair pipeline. [less ▲] Detailed reference viewed: 207 (18 UL)![]() Liu, Kui ![]() ![]() in 42nd ACM/IEEE International Conference on Software Engineering (ICSE) (2020, May) Test-based automated program repair has been a prolific field of research in software engineering in the last decade. Many approaches have indeed been proposed, which leverage test suites as a weak, but ... [more ▼] Test-based automated program repair has been a prolific field of research in software engineering in the last decade. Many approaches have indeed been proposed, which leverage test suites as a weak, but affordable, approximation to program specifications. Although the literature regularly sets new records on the number of benchmark bugs that can be fixed, several studies increasingly raise concerns about the limitations and biases of state-of-the-art approaches. For example, the correctness of generated patches has been questioned in a number of studies, while other researchers pointed out that evaluation schemes may be misleading with respect to the processing of fault localization results. Nevertheless, there is little work addressing the efficiency of patch generation, with regard to the practicality of program repair. In this paper, we fill this gap in the literature, by providing an extensive review on the efficiency of test suite based program repair. Our objective is to assess the number of generated patch candidates, since this information is correlated to (1) the strategy to traverse the search space efficiently in order to select sensical repair attempts, (2) the strategy to minimize the test effort for identifying a plausible patch, (3) as well as the strategy to prioritize the generation of a correct patch. To that end, we perform a large-scale empirical study on the efficiency, in terms of quantity of generated patch candidates of the 16 open-source repair tools for Java programs. The experiments are carefully conducted under the same fault localization configurations to limit biases. Eventually, among other findings, we note that: (1) many irrelevant patch candidates are generated by changing wrong code locations; (2) however, if the search space is carefully triaged, fault localization noise has little impact on patch generation efficiency; (3) yet, current template-based repair systems, which are known to be most effective in fixing a large number of bugs, are actually least efficient as they tend to generate majoritarily irrelevant patch candidates. [less ▲] Detailed reference viewed: 260 (18 UL)![]() Koyuncu, Anil ![]() ![]() ![]() in Empirical Software Engineering (2020) Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across ... [more ▼] Patching is a common activity in software development. It is generally performed on a source code base to address bugs or add new functionalities. In this context, given the recurrence of bugs across projects, the associated similar patches can be leveraged to extract generic fix actions. While the literature includes various approaches leveraging similarity among patches to guide program repair, these approaches often do not yield fix patterns that are tractable and reusable as actionable input to APR systems. In this paper, we propose a systematic and automated approach to mining relevant and actionable fix patterns based on an iterative clustering strategy applied to atomic changes within patches. The goal of FixMiner is thus to infer separate and reusable fix patterns that can be leveraged in other patch generation systems. Our technique, FixMiner, leverages Rich Edit Script which is a specialized tree structure of the edit scripts that captures the ASTlevel context of the code changes. FixMiner uses different tree representations of Rich Edit Scripts for each round of clustering to identify similar changes. These are abstract syntax trees, edit actions trees, and code context trees. We have evaluated FixMiner on thousands of software patches collected from open source projects. Preliminary results show that we are able to mine accurate patterns, efficiently exploiting change information in Rich Edit Scripts. We further integrated the mined patterns to an automated program repair prototype, PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J benchmark. Beyond this quantitative performance, we show that the mined fix patterns are sufficiently relevant to produce patches with a high probability of correctness: 81% of PARFixMiner’s generated plausible patches are correct. [less ▲] Detailed reference viewed: 139 (7 UL)![]() Tian, Haoye ![]() ![]() ![]() in Tian, Haoye (Ed.) 35th IEEE/ACM International Conference on Automated Software Engineering, September 21-25, 2020, Melbourne, Australia (2020) A large body of the literature of automated program repair develops approaches where patches are generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect ... [more ▼] A large body of the literature of automated program repair develops approaches where patches are generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by the oracle, may actually be incorrect. While the state of the art explore research directions that require dynamic information or rely on manually-crafted heuristics, we study the benefit of learning code representations to learn deep features that may encode the properties of patch correctness. Our work mainly investigates different representation learning approaches for code changes to derive embeddings that are amenable to similarity computations. We report on findings based on embeddings produced by pre-trained and re-trained neural networks. Experimental results demonstrate the potential of embeddings to empower learning algorithms in reasoning about patch correctness: a machine learning predictor with BERT transformer-based embeddings... [less ▲] Detailed reference viewed: 114 (34 UL)![]() Koyuncu, Anil ![]() ![]() ![]() in ESEC/FSE 2019 Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (2019, August) Issue tracking systems are commonly used in modern software development for collecting feedback from users and developers. An ultimate automation target of software maintenance is then the systematization ... [more ▼] Issue tracking systems are commonly used in modern software development for collecting feedback from users and developers. An ultimate automation target of software maintenance is then the systematization of patch generation for user-reported bugs. Although this ambition is aligned with the momentum of automated program repair, the literature has, so far, mostly focused on generate-and- validate setups where fault localization and patch generation are driven by a well-defined test suite. On the one hand, however, the common (yet strong) assumption on the existence of relevant test cases does not hold in practice for most development settings: many bugs are reported without the available test suite being able to reveal them. On the other hand, for many projects, the number of bug reports generally outstrips the resources available to triage them. Towards increasing the adoption of patch generation tools by practitioners, we investigate a new repair pipeline, iFixR, driven by bug reports: (1) bug reports are fed to an IR-based fault localizer; (2) patches are generated from fix patterns and validated via regression testing; (3) a prioritized list of generated patches is proposed to developers. We evaluate iFixR on the Defects4J dataset, which we enriched (i.e., faults are linked to bug reports) and carefully-reorganized (i.e., the timeline of test-cases is naturally split). iFixR generates genuine/plausible patches for 21/44 Defects4J faults with its IR-based fault localizer. iFixR accurately places a genuine/plausible patch among its top-5 recommendation for 8/13 of these faults (without using future test cases in generation-and-validation). [less ▲] Detailed reference viewed: 185 (19 UL)![]() Liu, Kui ![]() ![]() in 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA) (2019, July) We revisit the performance of template-based APR to build com-prehensive knowledge about the effectiveness of fix patterns, andto highlight the importance of complementary steps such as faultlocalization ... [more ▼] We revisit the performance of template-based APR to build com-prehensive knowledge about the effectiveness of fix patterns, andto highlight the importance of complementary steps such as faultlocalization or donor code retrieval. To that end, we first investi-gate the literature to collect, summarize and label recurrently-usedfix patterns. Based on the investigation, we buildTBar, a straight-forward APR tool that systematically attempts to apply these fixpatterns to program bugs. We thoroughly evaluateTBaron the De-fects4J benchmark. In particular, we assess the actual qualitative andquantitative diversity of fix patterns, as well as their effectivenessin yielding plausible or correct patches. Eventually, we find that,assuming a perfect fault localization,TBarcorrectly/plausibly fixes74/101 bugs. Replicating a standard and practical pipeline of APRassessment, we demonstrate thatTBarcorrectly fixes 43 bugs fromDefects4J, an unprecedented performance in the literature (includ-ing all approaches, i.e., template-based, stochastic mutation-basedor synthesis-based APR). [less ▲] Detailed reference viewed: 159 (11 UL)![]() Liu, Kui ![]() ![]() in 41st ACM/IEEE International Conference on Software Engineering (ICSE) (2019, May) To ensure code readability and facilitate software maintenance, program methods must be named properly. In particular, method names must be consistent with the corresponding method implementations ... [more ▼] To ensure code readability and facilitate software maintenance, program methods must be named properly. In particular, method names must be consistent with the corresponding method implementations. Debugging method names remains an important topic in the literature, where various approaches analyze commonalities among method names in a large dataset to detect inconsistent method names and suggest better ones. We note that the state-of-the-art does not analyze the implemented code itself to assess consistency. We thus propose a novel automated approach to debugging method names based on the analysis of consistency between method names and method code. The approach leverages deep feature representation techniques adapted to the nature of each artifact. Experimental results on over 2.1 million Java methods show that we can achieve up to 15 percentage points improvement over the state-of-the-art, establishing a record performance of 67.9% F1-measure in identifying inconsistent method names. We further demonstrate that our approach yields up to 25% accuracy in suggesting full names, while the state-of-the-art lags far behind at 1.1% accuracy. Finally, we report on our success in fixing 66 inconsistent method names in a live study on projects in the wild. [less ▲] Detailed reference viewed: 361 (21 UL)![]() Liu, Kui ![]() ![]() ![]() in The 12th IEEE International Conference on Software Testing, Verification and Validation (ICST-2019) (2019, April 24) Properly benchmarking Automated Program Repair (APR) systems should contribute to the development and adoption of the research outputs by practitioners. To that end, the research community must ensure ... [more ▼] Properly benchmarking Automated Program Repair (APR) systems should contribute to the development and adoption of the research outputs by practitioners. To that end, the research community must ensure that it reaches significant milestones by reliably comparing state-of-the-art tools for a better understanding of their strengths and weaknesses. In this work, we identify and investigate a practical bias caused by the fault localization (FL) step in a repair pipeline. We propose to highlight the different fault localization configurations used in the literature, and their impact on APR systems when applied to the Defects4J benchmark. Then, we explore the performance variations that can be achieved by "tweaking'' the FL step. Eventually, we expect to create a new momentum for (1) full disclosure of APR experimental procedures with respect to FL, (2) realistic expectations of repairing bugs in Defects4J, as well as (3) reliable performance comparison among the state-of-the-art APR systems, and against the baseline performance results of our thoroughly assessed kPAR repair tool. Our main findings include: (a) only a subset of Defects4J bugs can be currently localized by commonly-used FL techniques; (b) current practice of comparing state-of-the-art APR systems (i.e., counting the number of fixed bugs) is potentially misleading due to the bias of FL configurations; and (c) APR authors do not properly qualify their performance achievement with respect to the different tuning parameters implemented in APR systems. [less ▲] Detailed reference viewed: 251 (18 UL)![]() Liu, Kui ![]() ![]() in The 26th IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER-2019) (2019, February 24) Fix pattern-based patch generation is a promising direction in Automated Program Repair (APR). Notably, it has been demonstrated to produce more acceptable and correct patches than the patches obtained ... [more ▼] Fix pattern-based patch generation is a promising direction in Automated Program Repair (APR). Notably, it has been demonstrated to produce more acceptable and correct patches than the patches obtained with mutation operators through genetic programming. The performance of pattern-based APR systems, however, depends on the fix ingredients mined from fix changes in development histories. Unfortunately, collecting a reliable set of bug fixes in repositories can be challenging. In this paper, we propose to investigate the possibility in an APR scenario of leveraging code changes that address violations by static bug detection tools. To that end, we build the AVATAR APR system, which exploits fix patterns of static analysis violations as ingredients for patch generation. Evaluated on the Defects4J benchmark, we show that, assuming a perfect localization of faults, AVATAR can generate correct patches to fix 34/39 bugs. We further find that AVATAR yields performance metrics that are comparable to that of the closely-related approaches in the literature. While AVATAR outperforms many of the state-of-the-art pattern-based APR systems, it is mostly complementary to current approaches. Overall, our study highlights the relevance of static bug finding tools as indirect contributors of fix ingredients for addressing code defects identified with functional test cases. [less ▲] Detailed reference viewed: 226 (18 UL)![]() Liu, Kui ![]() ![]() ![]() in 25th Asia-Pacific Software Engineering Conference (APSEC) (2018, December 07) Automated program repair (APR) has extensively been developed by leveraging search-based techniques, in which fix ingredients are explored and identified in different granularities from a specific search ... [more ▼] Automated program repair (APR) has extensively been developed by leveraging search-based techniques, in which fix ingredients are explored and identified in different granularities from a specific search space. State-of-the approaches often find fix ingredients by using mutation operators or leveraging manually-crafted templates. We argue that the fix ingredients can be searched in an online mode, leveraging code search techniques to find potentially-fixed versions of buggy code fragments from which repair actions can be extracted. In this study, we present an APR tool, LSRepair, that automatically explores code repositories to search for fix ingredients at the method-level granularity with three strategies of similar code search. Our preliminary evaluation shows that code search can drive a faster fix process (some bugs are fixed in a few seconds). LSRepair helps repair 19 bugs from the Defects4J benchmark successfully. We expect our approach to open new directions for fixing multiple-lines bugs. [less ▲] Detailed reference viewed: 305 (27 UL)![]() Liu, Kui ![]() ![]() ![]() in 34th IEEE International Conference on Software Maintenance and Evolution (ICSME) (2018, September) Bug fixing is a time-consuming and tedious task. To reduce the manual efforts in bug fixing, researchers have presented automated approaches to software repair. Unfortunately, recent studies have shown ... [more ▼] Bug fixing is a time-consuming and tedious task. To reduce the manual efforts in bug fixing, researchers have presented automated approaches to software repair. Unfortunately, recent studies have shown that the state-of-the-art techniques in automated repair tend to generate patches only for a small number of bugs even with quality issues (e.g., incorrect behavior and nonsensical changes). To improve automated program repair (APR) techniques, the community should deepen its knowledge on repair actions from real-world patches since most of the techniques rely on patches written by human developers. Previous investigations on real-world patches are limited to statement level that is not sufficiently fine-grained to build this knowledge. In this work, we contribute to building this knowledge via a systematic and fine-grained study of 16,450 bug fix commits from seven Java open-source projects. We find that there are opportunities for APR techniques to improve their effectiveness by looking at code elements that have not yet been investigated. We also discuss nine insights into tuning automated repair tools. For example, a small number of statement and expression types are recurrently impacted by real-world patches, and expression-level granularity could reduce search space of finding fix ingredients, where previous studies never explored. [less ▲] Detailed reference viewed: 215 (24 UL)![]() Koyuncu, Anil ![]() ![]() ![]() Scientific Conference (2017, July) In this work, we investigate the practice of patch construction in the Linux kernel development, focusing on the differences between three patching processes: (1) patches crafted entirely manually to fix ... [more ▼] In this work, we investigate the practice of patch construction in the Linux kernel development, focusing on the differences between three patching processes: (1) patches crafted entirely manually to fix bugs, (2) those that are derived from warnings of bug detection tools, and (3) those that are automatically generated based on fix patterns. With this study, we provide to the research community concrete insights on the practice of patching as well as how the development community is currently embracing research and commercial patching tools to improve productivity in repair. The result of our study shows that tool-supported patches are increasingly adopted by the developer community while manually-written patches are accepted more quickly. Patch application tools enable developers to remain committed to contributing patches to the code base. Our findings also include that, in actual development processes, patches generally implement several change operations spread over the code, even for patches fixing warnings by bug detection tools. Finally, this study has shown that there is an opportunity to directly leverage the output of bug detection tools to readily generate patches that are appropriate for fixing the problem, and that are consistent with manually-written patches. [less ▲] Detailed reference viewed: 250 (20 UL) |
||