Doctoral thesis (Dissertations and theses)
Learning Code Change Semantics for Patch Correctness Assessment in Program Repair
TIAN, Haoye
2023
 

Files


Full Text
PhD_Dissertation_Haoye Tian.pdf
Author postprint (5.63 MB)

All documents in ORBilu are protected by a user license.

Details



Keywords :
Program repair; Patch correctness; LLMs; Patch overfitting; Representation learning
Abstract :
[en] State-of-the-art automated program repair (APR) techniques currently produce patches that, on manual evaluation, turn out to be overfitting. These overfitting patches often worsen the original program, for example by introducing security vulnerabilities or removing useful features. This obstructs the development of APR techniques that rely on feedback from correctly generated patches, and it shifts the expense of developers’ manual debugging to evaluating patch correctness. Automated assessment of patch correctness has the potential to reduce patch validation costs and accelerate the identification of practically correct patches, making it easier for developers to adopt APR techniques. While the approaches proposed in the literature have been shown to be effective, several challenges remain unexplored and warrant further investigation. This thesis begins with an empirical analysis of a prevalent hypothesis concerning patch correctness, leading to the establishment of a patch correctness prediction framework based on representation learning. Second, we validate correct patches through a novel heuristic on the relationship between patches and their associated failing test cases. Lastly, we present a novel perspective that assesses patch correctness with natural language processing. Our contributions to the research field through this thesis are as follows:
1) assessing the feasibility of utilizing advancements in deep representation learning to generate patch embeddings suitable for reasoning about correctness. Consequently, we establish Leopard, a supervised learning-based patch correctness prediction framework.
2) comparing code embeddings and engineered features for patch correctness prediction, and investigating their combination in Panther (an upgraded version of Leopard) for more accurate classification. Additionally, we use the SHAP explainability model to reveal the essential aspects of patch correctness by interpreting the underlying causes of prediction performance across features and classifiers.
3) presenting and validating a key hypothesis: when different programs fail to pass similar test cases, it is likely that these programs require similar code changes. Based on this heuristic, we propose BATS, an approach that predicts patch correctness by statically comparing generated patches against previous correct patches for programs that failed on similar tests.
4) proposing a novel perspective on patch correctness assessment: a correct patch implements changes that address the issue caused by the buggy behavior. By leveraging bug reports to offer an explicit description of the bug, we build Quatrain, a supervised learning approach that utilizes a deep NLP model to predict the relevance between a bug report and a patch description.
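The hypothesis behind BATS (similar failing tests suggest similar code changes) can be illustrated with a minimal sketch. This is not the thesis's implementation: the token-overlap (Jaccard) similarity, the two cutoffs, the function names, and the toy history below are all assumptions made for illustration, standing in for whatever representations and thresholds the actual approach uses.

```python
import re

def tokens(text):
    """Lowercased identifier-like tokens; a crude stand-in for a learned embedding."""
    return set(re.findall(r"[A-Za-z_]\w*", text.lower()))

def jaccard(a, b):
    """Token-set overlap in [0, 1]; an assumed similarity measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def predict_patch_correct(failing_test, candidate_patch, history,
                          test_cutoff=0.5, patch_cutoff=0.5):
    """Hypothetical helper.

    history: list of (failing_test_code, known_correct_patch) pairs.
    Returns True or False, or None when no historical test is similar
    enough to support a prediction.
    """
    # Step 1: keep correct patches whose failing test resembles the current one.
    similar = [patch for test, patch in history
               if jaccard(failing_test, test) >= test_cutoff]
    if not similar:
        return None
    # Step 2: the candidate is predicted correct if it resembles one of them.
    best = max(jaccard(candidate_patch, patch) for patch in similar)
    return best >= patch_cutoff

# Toy data, invented for illustration only.
history = [
    ("assertEquals(0, StringUtils.indexOf(null, 'a'))", "if (str == null) return -1;"),
    ("assertTrue(list.isEmpty())", "list.clear();"),
]
failing_test = "assertEquals(-1, StringUtils.indexOf(null, 'b'))"
plausible_patch = "if (str == null) { return -1; }"
unrelated_patch = "count = count + 1;"
unseen_test = "assertNotNull(parser.parse(input))"
```

With this toy history, `predict_patch_correct(failing_test, plausible_patch, history)` returns `True`, the unrelated patch yields `False`, and a failing test with no similar historical counterpart yields `None`, which makes explicit that the heuristic only applies when comparable past failures exist.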
Disciplines :
Computer science
Author, co-author :
TIAN, Haoye ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
Language :
English
Title :
Learning Code Change Semantics for Patch Correctness Assessment in Program Repair
Defense date :
15 September 2023
Institution :
Unilu - University of Luxembourg, Luxembourg
Degree :
Docteur en Informatique (DIP_DOC_0006_B)
Promotor :
BISSYANDE, Tegawendé François d'Assise  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
President :
KLEIN, Jacques  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
Jury member :
CORDY, Maxime  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SerVal
LE GOUES, Claire  ;  Carnegie Mellon University > Software and Societal Systems Department
LO, David  ;  Singapore Management University > Information Systems and Technology Cluster
Focus Area :
Security, Reliability and Trust
Available on ORBilu :
since 29 September 2023

Statistics


Number of views
161 (14 by Unilu)
Number of downloads
196 (14 by Unilu)
