Results 1-8 of 8.
((uid:50002091))

Bookmark and Share    
Full Text
Peer Reviewed
See detailA Closer Look at Real-World Patches
Liu, Kui UL; Kim, Dongsun UL; Koyuncu, Anil UL et al

Scientific Conference (2018, September)

Bug fixing is a time-consuming and tedious task. To reduce the manual efforts in bug fixing, researchers have presented automated approaches to software repair. Unfortunately, recent studies have shown ... [more ▼]

Bug fixing is a time-consuming and tedious task. To reduce the manual efforts in bug fixing, researchers have presented automated approaches to software repair. Unfortunately, recent studies have shown that the state-of-the-art techniques in automated repair tend to generate patches only for a small number of bugs even with quality issues (e.g., incorrect behavior and nonsensical changes). To improve automated program repair (APR) techniques, the community should deepen its knowledge on repair actions from real-world patches since most of the techniques rely on patches written by human developers. Previous investigations on real-world patches are limited to statement level that is not sufficiently fine-grained to build this knowledge. In this work, we contribute to building this knowledge via a systematic and fine-grained study of 16,450 bug fix commits from seven Java open-source projects. We find that there are opportunities for APR techniques to improve their effectiveness by looking at code elements that have not yet been investigated. We also discuss nine insights into tuning automated repair tools. For example, a small number of statement and expression types are recurrently impacted by real-world patches, and expression-level granularity could reduce search space of finding fix ingredients, where previous studies never explored. [less ▲]

Detailed reference viewed: 5 (1 UL)
Full Text
Peer Reviewed
See detailFaCoY - A Code-to-Code Search Engine
Kim, Kisub UL; Kim, Dongsun UL; Bissyande, Tegawendé François D Assise UL et al

in International Conference on Software Engineering (ICSE 2018) (2018, May 27)

Code search is an unavoidable activity in software development. Various approaches and techniques have been explored in the literature to support code search tasks. Most of these approaches focus on ... [more ▼]

Code search is an unavoidable activity in software development. Various approaches and techniques have been explored in the literature to support code search tasks. Most of these approaches focus on serving user queries provided as natural language free-form input. However, there exists a wide range of use-case scenarios where a code-to-code approach would be most beneficial. For example, research directions in code transplantation, code diversity, patch recommendation can leverage a code-to-code search engine to find essential ingredients for their techniques. In this paper, we propose FaCoY, a novel approach for statically finding code fragments which may be semantically similar to user input code. FaCoY implements a query alternation strategy: instead of directly matching code query tokens with code in the search space, FaCoY first attempts to identify other tokens which may also be relevant in implementing the functional behavior of the input code. With various experiments, we show that (1) FaCoY is more effective than online code-to-code search engines; (2) FaCoY can detect more semantic code clones (i.e., Type-4) in BigCloneBench than the state-of-theart; (3) FaCoY, while static, can detect code fragments which are indeed similar with respect to runtime execution behavior; and (4) FaCoY can be useful in code/patch recommendation. [less ▲]

Detailed reference viewed: 34 (2 UL)
Full Text
Peer Reviewed
See detailImpact of Tool Support in Patch Construction
Koyuncu, Anil UL; Bissyande, Tegawendé François D Assise UL; Kim, Dongsun UL et al

Scientific Conference (2017, July)

In this work, we investigate the practice of patch construction in the Linux kernel development, focusing on the differences between three patching processes: (1) patches crafted entirely manually to fix ... [more ▼]

In this work, we investigate the practice of patch construction in the Linux kernel development, focusing on the differences between three patching processes: (1) patches crafted entirely manually to fix bugs, (2) those that are derived from warnings of bug detection tools, and (3) those that are automatically generated based on fix patterns. With this study, we provide to the research community concrete insights on the practice of patching as well as how the development community is currently embracing research and commercial patching tools to improve productivity in repair. The result of our study shows that tool-supported patches are increasingly adopted by the developer community while manually-written patches are accepted more quickly. Patch application tools enable developers to remain committed to contributing patches to the code base. Our findings also include that, in actual development processes, patches generally implement several change operations spread over the code, even for patches fixing warnings by bug detection tools. Finally, this study has shown that there is an opportunity to directly leverage the output of bug detection tools to readily generate patches that are appropriate for fixing the problem, and that are consistent with manually-written patches. [less ▲]

Detailed reference viewed: 91 (10 UL)
Full Text
See detailAugmenting and Structuring User Queries to Support Efficient Free-Form Code Search
Sirres, Raphael; Bissyande, Tegawendé François D Assise UL; Kim, Dongsun UL et al

Report (2017)

Source code terms such as method names and variable types are often different from conceptual words mentioned in a search query. This vocabulary mismatch problem can make code search inefficient. In this ... [more ▼]

Source code terms such as method names and variable types are often different from conceptual words mentioned in a search query. This vocabulary mismatch problem can make code search inefficient. In this paper, we present Code voCABUlary (CoCaBu), an approach to resolving the vocabulary mismatch problem when dealing with free-form code search queries. Our approach leverages common developer questions and the associated expert answers to augment user queries with the relevant, but missing, structural code entities in order to improve the performance of matching relevant code examples within large code repositories. To instantiate this approach, we build GitSearch, a code search engine, on top of GitHub and StackOverflow Q\&A data. We evaluate GitSearch in several dimensions to demonstrate that (1) its code search results are correct with respect to user-accepted answers; (2) the results are qualitatively better than those of existing Internet-scale code search engines; (3) our engine is competitive against web search engines, such as Google, in helping users complete solve programming tasks; and (4) GitSearch provides code examples that are acceptable or interesting to the community as answers for StackOverflow questions. [less ▲]

Detailed reference viewed: 177 (29 UL)
Full Text
See detailWatch out for This Commit! A Study of Influential Software Changes
Li, Daoyuan UL; Li, Li UL; Kim, Dongsun UL et al

Report (2016)

One single code change can significantly influence a wide range of software systems and their users. For example, 1) adding a new feature can spread defects in several modules, while 2) changing an API ... [more ▼]

One single code change can significantly influence a wide range of software systems and their users. For example, 1) adding a new feature can spread defects in several modules, while 2) changing an API method can improve the performance of all client programs. Developers often may not clearly know whether their or others’ changes are influential at commit time. Rather, it turns out to be influential after affecting many aspects of a system later. This paper investigates influential software changes and proposes an approach to identify them early, i.e., immediately when they are applied. We first conduct a post- mortem analysis to discover existing influential changes by using intuitions such as isolated changes and changes referred by other changes in 10 open source projects. Then we re-categorize all identified changes through an open-card sorting process. Subsequently, we conduct a survey with 89 developers to confirm our influential change categories. Finally, from our ground truth we extract features, including metrics such as the complexity of changes, terms in commit logs and file centrality in co-change graphs, to build ma- chine learning classifiers. The experiment results show that our prediction model achieves overall with random samples 86.8% precision, 74% recall and 80.4% F-measure respectively. [less ▲]

Detailed reference viewed: 127 (19 UL)
Full Text
Peer Reviewed
See detailAutomatic Identifier Inconsistency Detection Using Code Dictionary
Kim, Suntae; Kim, Dongsun UL

in Empirical Software Engineering (2016), 21(2), 565-604

Inconsistent identifiers make it difficult for developers to understand source code. In particular, large software systems written by several developers can be vulnerable to identifier inconsistency ... [more ▼]

Inconsistent identifiers make it difficult for developers to understand source code. In particular, large software systems written by several developers can be vulnerable to identifier inconsistency. Unfortunately, it is not easy to detect inconsistent identifiers that are already used in source code. Although several techniques have been proposed to address this issue, many of these techniques can result in false alarms since such techniques do not accept domain words and idiom identifiers that are widely used in programming practice. This paper proposes an approach to detecting inconsistent identifiers based on a custom code dictionary. It first automatically builds a Code Dictionary from the existing API documents of popular Java projects by using an Natural Language Processing (NLP) parser. This dictionary records domain words with dominant part-of-speech (POS) and idiom identifiers. This set of domain words and idioms can improve the accuracy when detecting inconsistencies by reducing false alarms. The approach then takes a target program and detects inconsistent identifiers of the program by leveraging the Code Dictionary. We provide CodeAmigo, a GUI-based tool support for our approach. We evaluated our approach on seven Java based open-/proprietarysource projects. The results of the evaluations show that the approach can detect inconsistent identifiers with 85.4% precision and 83.59% recall values. In addition, we conducted an interview with developers who used our approach, and the interview confirmed that inconsistent identifiers frequently and inevitably occur in most software projects. The interviewees then stated that our approach can help to better detect inconsistent identifiers that would have been missed through manual detection. [less ▲]

Detailed reference viewed: 120 (13 UL)
Full Text
Peer Reviewed
See detail“Overloaded!” — A Model-based Approach to Database Stress Testing
Meira, Jorge Augusto UL; Almeira, Eduardo Cunha de; Kim, Dongsun UL et al

in International Conference on Database and Expert Systems Applications, Porto 5-8 September 2016 (2016)

Detailed reference viewed: 58 (1 UL)
Peer Reviewed
See detailAPI Document Quality for Resolving Deprecated APIs
Ko, Deokyoon; Ma, Kyeongwook; Park, Sooyong et al

Scientific Conference (2014, December 01)

Detailed reference viewed: 121 (11 UL)