Paper published in a book (Scientific congresses, symposiums and conference proceedings)
Natural Language to Code: How Far Are We?
Wang, Shangwen; Geng, Mingyang; Lin, Bo et al.
2023. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)
Peer reviewed
 

Files


Full Text
3611643.3616323.pdf
Publisher postprint (955.65 kB)

Details



Abstract :
[en] A longstanding dream in software engineering research is to devise effective approaches for automating development tasks based on developers’ informally-specified intentions. Such intentions are generally in the form of natural language descriptions. In recent literature, a number of approaches have been proposed to automate tasks such as code search and even code generation based on natural language inputs. While these approaches vary in terms of technical designs, their objective is the same: transforming a developer’s intention into source code. The literature, however, lacks a comprehensive understanding towards the effectiveness of existing techniques as well as their complementarity to each other. We propose to fill this gap through a large-scale empirical study where we systematically evaluate natural language to code techniques. Specifically, we consider six state-of-the-art techniques targeting code search, and four targeting code generation. Through extensive evaluations on a dataset of 22K+ natural language queries, our study reveals the following major findings: (1) code search techniques based on model pre-training are so far the most effective while code generation techniques can also provide promising results; (2) complementarity widely exists among the existing techniques; and (3) combining the ten techniques together can enhance the performance for 35% compared with the most effective standalone technique. Finally, we propose a post-processing strategy to automatically integrate different techniques based on their generated code. Experimental results show that our devised strategy is both effective and extensible.
Disciplines :
Computer science
Author, co-author :
Wang, Shangwen;  National University of Defense Technology, Changsha, China
Geng, Mingyang;  National University of Defense Technology, Changsha, China
Lin, Bo;  National University of Defense Technology, Changsha, China
Sun, Zhensu;  Singapore Management University, Singapore, Singapore
Wen, Ming;  Huazhong University of Science and Technology, Wuhan, China
Liu, Yepang;  Southern University of Science and Technology, Shenzhen, China
Li, Li;  Beihang University, Beijing, China
BISSYANDE, Tegawendé François d Assise;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > TruX
Mao, Xiaoguang;  National University of Defense Technology, Changsha, China
External co-authors :
yes
Language :
English
Title :
Natural Language to Code: How Far Are We?
Publication date :
30 November 2023
Event name :
31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Event date :
3-9 December 2023
Event number :
31
Audience :
International
Main work title :
Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)
Publisher :
ACM, Washington, DC, United States
Pages :
375–387
Peer reviewed :
Peer reviewed
Focus Area :
Security, Reliability and Trust
European Projects :
H2020 - 949014 - NATURAL - Natural Program Repair
Funders :
National Natural Science Foundation of China
European Research Council
European Union
Available on ORBilu :
since 03 December 2023

Statistics


Number of views
103 (1 by Unilu)
Number of downloads
486 (2 by Unilu)

Scopus citations®
15
Scopus citations® without self-citations
7
OpenAlex citations
14
