Article (Scientific journals)
Learning the Relation between Code Features and Code Transforms with Structured Prediction
Yu, Zhongxing; Martinez, Matias; Chen, Zimin et al.
2023. In IEEE Transactions on Software Engineering, 49 (7), p. 3872 - 3900
Peer Reviewed verified by ORBi
 

Files


Full Text
1907.09282v2.pdf
Author postprint (1.72 MB)


Details



Keywords :
big code; Code transform; machine learning; program repair; Big code; Code; Codes transform; Computer bugs; Feature transform; Features extraction; Machine-learning; Predictive models; Program repair; Synthesizer; Software; Computer Science - Software Engineering; Computer Science - Learning; Computer Science - Programming Languages
Abstract :
[en] To effectively guide the exploration of the code transform space for automated code evolution techniques, we present in this article the first approach for structurally predicting code transforms at the level of AST nodes using conditional random fields (CRFs). Our approach first learns offline a probabilistic model that captures how certain code transforms are applied to certain AST nodes, and then uses the learned model to predict transforms for arbitrary new, unseen code snippets. Our approach involves a novel representation of both programs and code transforms. Specifically, we introduce the formal framework for defining the so-called AST-level code transforms and we demonstrate how the CRF model can be accordingly designed, learned, and used for prediction. We instantiate our approach in the context of repair transform prediction for Java programs. Our instantiation contains a set of carefully designed code features, deals with the training data imbalance issue, and comprises transform constraints that are specific to code. We conduct a large-scale experimental evaluation based on a dataset of bug fixing commits from real-world Java projects. The results show that when the popular evaluation metric top-3 is used, our approach predicts the code transforms with an accuracy varying from 41% to 53% depending on the transforms. Our model outperforms two baselines based on history probability and neural machine translation (NMT), suggesting the importance of considering code structure in achieving good prediction accuracy. In addition, a proof-of-concept synthesizer is implemented to concretize some repair transforms to get the final patches. The evaluation of the synthesizer on the Defects4J benchmark confirms the usefulness of the predicted AST-level repair transforms in producing high-quality patches.
Disciplines :
Computer science
Author, co-author :
Yu, Zhongxing ;  Shandong University, Jinan, China
Martinez, Matias ;  Universitat Politècnica de Catalunya, Barcelona, Spain
Chen, Zimin ;  KTH Royal Institute of Technology, Stockholm, Sweden
BISSYANDE, Tegawendé François d'Assise ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
Monperrus, Martin ;  KTH Royal Institute of Technology, Stockholm, Sweden
External co-authors :
yes
Language :
English
Title :
Learning the Relation between Code Features and Code Transforms with Structured Prediction
Publication date :
July 2023
Journal title :
IEEE Transactions on Software Engineering
ISSN :
0098-5589
eISSN :
1939-3520
Publisher :
Institute of Electrical and Electronics Engineers Inc.
Volume :
49
Issue :
7
Pages :
3872 - 3900
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBilu :
since 10 December 2024
