Paper published in a journal (Scientific congresses, symposiums and conference proceedings)
Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance
SONG, Yewei; LOTHRITZ, Cedric; TANG, Xunzhu et al.
2024. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), p. 38-46
Peer reviewed
 

Files


Full Text
2024.acl-short.3 (1).pdf
Publisher postprint (421.27 kB) Creative Commons License - Attribution, ShareAlike

Details



Abstract :
This paper revisits recent code similarity evaluation metrics, focusing on the application of Abstract Syntax Tree (AST) edit distance across diverse programming languages. We explore the usefulness of these metrics and compare them to traditional sequence similarity metrics. Our experiments showcase the effectiveness of AST edit distance in capturing intricate code structures, revealing a high correlation with established metrics. Furthermore, we explore the strengths and weaknesses of AST edit distance and prompt-based GPT similarity scores in comparison to BLEU score, execution match, and Jaccard similarity. We propose, optimize, and publish an adaptable metric that demonstrates effectiveness across all tested languages, representing an enhanced version of Tree Similarity of Edit Distance (TSED).
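To make the idea of an AST edit distance similarity concrete, here is a minimal sketch in Python. It is not the authors' published implementation. It rests on several assumptions that this record does not state: the score is normalized as max(1 - distance / max node count, 0), every insert, delete, and relabel costs 1, trees come from Python's built-in `ast` module, and a naive memoized recursion stands in for an optimized tree edit distance algorithm. The helper names (`to_tree`, `forest_distance`, `tsed`) are hypothetical.

```python
import ast
from functools import lru_cache


def to_tree(node):
    """Convert a Python ast node into a hashable (label, children) tuple."""
    children = tuple(to_tree(child) for child in ast.iter_child_nodes(node))
    return (type(node).__name__, children)


def size(forest):
    """Count the nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost ordered edit distance between two forests.

    Naive memoized recursion on the rightmost roots; only suitable
    for small snippets.
    """
    if not f1:
        return size(f2)  # insert everything that remains in f2
    if not f2:
        return size(f1)  # delete everything that remains in f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    # Delete the rightmost root of f1; its children move up into the forest.
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    # Insert the rightmost root of f2.
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    # Match the two roots, paying 1 if their labels differ.
    match = (forest_distance(kids1, kids2)
             + forest_distance(f1[:-1], f2[:-1])
             + (label1 != label2))
    return min(delete, insert, match)


def tree_edit_distance(t1, t2):
    """Edit distance between two single trees."""
    return forest_distance((t1,), (t2,))


def tsed(code1, code2):
    """Assumed normalization: max(1 - distance / max node count, 0)."""
    t1 = to_tree(ast.parse(code1))
    t2 = to_tree(ast.parse(code2))
    max_nodes = max(size((t1,)), size((t2,)))
    return max(1.0 - tree_edit_distance(t1, t2) / max_nodes, 0.0)
```

Calling `tsed("x = 1", "x = 1")` gives 1.0 for identical snippets, and structurally different snippets score lower. A practical version would need a language-agnostic parser and tunable operation costs, both of which this sketch leaves out.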
Research center :
NCER-FT - FinTech National Centre of Excellence in Research
Disciplines :
Computer science
Author, co-author :
SONG, Yewei ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
LOTHRITZ, Cedric ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust > TruX > Team Tegawendé François d A BISSYANDE
TANG, Xunzhu ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
BISSYANDE, Tegawendé François d'Assise ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
KLEIN, Jacques ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
External co-authors :
yes
Language :
English
Title :
Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance
Publication date :
11 August 2024
Event name :
The 62nd Annual Meeting of the Association for Computational Linguistics
Event organizer :
Association for Computational Linguistics
Event place :
Bangkok, Thailand
Event date :
from 11 to 17 August 2024
Audience :
International
Journal title :
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Publisher :
Association for Computational Linguistics
Pages :
38-46
Peer reviewed :
Peer reviewed
FNR Project :
NCER22/IS/16570468/NCERFT
FNR16229163 - LuxemBERT - Multilingual NLP Coping With Luxembourg Specificities For The Financial Industry, 2021 (01/01/2022-31/12/2024) - Jacques Klein
Available on ORBilu :
since 13 November 2024
