TUNA: TUning Naturalness-based Analysis

JIMENEZ, Matthieu; CORDY, Maxime; LE TRAON, Yves; PAPADAKIS, Mike

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2018 • In 34th IEEE International Conference on Software Maintenance and Evolution, Madrid, Spain, 26-28 September 2018

Peer reviewed

Permalink
https://hdl.handle.net/10993/36136

Files

Full Text

tuna2.pdf

Author preprint (33.91 kB)

All documents in ORBilu are protected by a user license.

Details

Keywords :

Artifact; Naturalness; Source Code Analysis

Abstract :

[en] Natural language processing techniques, in particular n-gram models, have been applied successfully to a number of software engineering tasks. However, in our related ICSME ’18 paper, we showed that the conclusions of a study can change drastically depending on how the code is tokenized and how the n-gram model is parameterized. These choices are therefore of utmost importance and must be made carefully. To demonstrate this and to let the community benefit from our work, we developed TUNA (TUning Naturalness-based Analysis), a Java software artifact for performing naturalness-based analyses of source code. To the best of our knowledge, TUNA is the first open-source, end-to-end toolchain for carrying out source code analyses based on naturalness.
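
To make the sensitivity described in the abstract concrete, the following is a minimal illustrative sketch, not TUNA's code or API. It is written in Python rather than Java for brevity. It assumes that naturalness is measured as the average per-token cross-entropy under an n-gram model, and it uses add-one smoothing, which may differ from the smoothing TUNA applies. The two tokenizers, the function names, and the toy corpus are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize_whitespace(code):
    """Hypothetical tokenizer 1: split on whitespace only."""
    return code.split()


def tokenize_lexical(code):
    """Hypothetical tokenizer 2: identifiers and numbers stay whole,
    every other non-space character becomes its own token."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def ngrams(tokens, n):
    """Return the n-grams of a token list, padded with start/end markers."""
    padded = ["<s>"] * (n - 1) + list(tokens) + ["</s>"]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def train(corpus_tokens, n):
    """Count n-grams and their contexts over a list of token lists."""
    counts, context_counts = Counter(), Counter()
    vocab = {"</s>", "<unk>"}
    for tokens in corpus_tokens:
        vocab.update(tokens)
        for gram in ngrams(tokens, n):
            counts[gram] += 1
            context_counts[gram[:-1]] += 1
    return counts, context_counts, len(vocab)


def cross_entropy(tokens, model, n):
    """Average negative log2 probability per n-gram (add-one smoothing)."""
    counts, context_counts, vocab_size = model
    grams = ngrams(tokens, n)
    total = 0.0
    for gram in grams:
        p = (counts[gram] + 1) / (context_counts[gram[:-1]] + vocab_size)
        total -= math.log2(p)
    return total / len(grams)


if __name__ == "__main__":
    # Toy training corpus and query line (assumptions for illustration).
    corpus = ["int i = 0;", "int j = 0;", "i = i + 1;"]
    query = "int i = 0;"
    for name, tok in [("whitespace", tokenize_whitespace),
                      ("lexical", tokenize_lexical)]:
        for n in (1, 2):
            model = train([tok(line) for line in corpus], n)
            score = cross_entropy(tok(query), model, n)
            print(f"{name:10s} n={n} cross-entropy={score:.3f}")
```

Running the script prints one score per tokenizer and per value of n for the same line of code. The scores differ across configurations, which is the effect the abstract warns about: the same code can look more or less "natural" depending on these choices.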

Research center :

Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Security Design and Validation Research Group (SerVal)

Disciplines :

Computer science

Author, co-author :

JIMENEZ, Matthieu ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)

CORDY, Maxime ; Facultés Universitaires Notre-Dame de la Paix - Namur - FUNDP

LE TRAON, Yves ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)

PAPADAKIS, Mike ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)

External co-authors :

yes

Language :

English

Title :

TUNA: TUning Naturalness-based Analysis

Publication date :

26 September 2018

Event name :

34th IEEE International Conference on Software Maintenance and Evolution (ICSME'18)

Event place :

Madrid, Spain

Event date :

26-28 September 2018

Audience :

International

Main work title :

34th IEEE International Conference on Software Maintenance and Evolution, Madrid, Spain, 26-28 September 2018

Peer reviewed :

Peer reviewed

Available on ORBilu :

since 12 July 2018

Statistics

Number of views

289 (10 by Unilu)

Number of downloads

309 (6 by Unilu)

Bibliography

A. Hindle, E. T. Barr, Z. Su, M. Gabel, and P. Devanbu, "On the naturalness of software," in Proceedings of ICSE '12. Piscataway, NJ, USA: IEEE Press, 2012, pp. 837-847.
B. Ray, V. Hellendoorn, S. Godhane, Z. Tu, A. Bacchelli, and P. Devanbu, "On the 'naturalness' of buggy code," in Proceedings of ICSE '16. New York, NY, USA: ACM, 2016, pp. 428-439.
M. Allamanis, E. T. Barr, P. Devanbu, and C. Sutton, "A survey of machine learning for big code and naturalness," CoRR, vol. abs/1709.06182, 2017.
M. Jimenez, M. Cordy, Y. Le Traon, and M. Papadakis, "On the impact of tokenizer and parameters on n-gram based code analysis," in Proceedings of ICSME '18, 2018.
JavaParser. (2017) JavaParser on GitHub. [Online]. Available: https://github.com/javaparser/javaparser
G. Neubig. (2017) Kyoto language modeling toolkit. [Online]. Available: https://github.com/neubig/kylm