Article (Scientific journals)
Automatic Identifier Inconsistency Detection Using Code Dictionary
Kim, Suntae; Kim, Dongsun
2016, in Empirical Software Engineering, 21 (2), p. 565-604
Peer Reviewed verified by ORBi
 

Files


Full Text
EMSE-D-14-00036.pdf
Author preprint (4.93 MB)

Details



Abstract :
Inconsistent identifiers make it difficult for developers to understand source code. Large software systems written by several developers are particularly vulnerable to identifier inconsistency. Unfortunately, it is not easy to detect inconsistent identifiers that are already used in source code. Although several techniques have been proposed to address this issue, many of them can raise false alarms because they do not accept domain words and idiom identifiers that are widely used in programming practice. This paper proposes an approach to detecting inconsistent identifiers based on a custom code dictionary. It first automatically builds a Code Dictionary from the existing API documents of popular Java projects by using a Natural Language Processing (NLP) parser. This dictionary records domain words with their dominant part-of-speech (POS) and idiom identifiers. This set of domain words and idioms can improve accuracy when detecting inconsistencies by reducing false alarms. The approach then takes a target program and detects its inconsistent identifiers by leveraging the Code Dictionary. We provide CodeAmigo, a GUI-based tool that supports our approach. We evaluated our approach on seven Java-based open-source and proprietary projects. The results show that the approach can detect inconsistent identifiers with 85.4% precision and 83.59% recall. In addition, we interviewed developers who used our approach, and the interviews confirmed that inconsistent identifiers frequently and inevitably occur in most software projects. The interviewees also stated that our approach can help detect inconsistent identifiers that would have been missed through manual detection.
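The two stages described in the abstract (build a dictionary of words with their dominant POS, then check a target program's identifiers against it) can be illustrated with a toy sketch. This is not the paper's algorithm or CodeAmigo's implementation: the function names, the tag labels ("verb", "noun"), and the assumption that POS tags are already supplied by some external NLP parser are all hypothetical, chosen only to make the idea concrete.

```python
import re
from collections import Counter, defaultdict


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def build_code_dictionary(tagged_words):
    """Map each word to its most frequent (dominant) POS tag.

    tagged_words: iterable of (word, pos) pairs. Here they are hand-written;
    in practice they would come from an NLP parser run over API documents
    (an assumption of this sketch).
    """
    counts = defaultdict(Counter)
    for word, pos in tagged_words:
        counts[word.lower()][pos] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def find_pos_inconsistencies(identifiers, dictionary, idioms=frozenset()):
    """Report words whose POS in an identifier differs from the dominant POS.

    identifiers: dict mapping identifier name -> list of POS tags, one per
    word of the identifier (hypothetical input format).
    idioms: identifiers accepted as-is, e.g. loop counters such as "i".
    """
    findings = []
    for name, tags in identifiers.items():
        if name in idioms:
            continue  # idiom identifiers are never flagged
        for word, pos in zip(split_identifier(name), tags):
            expected = dictionary.get(word)
            if expected is not None and pos != expected:
                findings.append((name, word, pos, expected))
    return findings


# Hypothetical tagged corpus and target identifiers.
corpus = [("get", "verb"), ("get", "verb"), ("name", "noun"),
          ("name", "noun"), ("name", "verb"), ("size", "noun")]
dictionary = build_code_dictionary(corpus)
targets = {"getName": ["verb", "noun"], "nameSize": ["verb", "noun"], "i": ["noun"]}
findings = find_pos_inconsistencies(targets, dictionary, idioms={"i"})
```

In this sketch, accepting idioms and looking up only words known to the dictionary is what keeps common programming vocabulary from being reported, which mirrors the abstract's point about reducing false alarms.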
Disciplines :
Computer science
Author, co-author :
Kim, Suntae 
Kim, Dongsun  ;  University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)
 These authors have contributed equally to this work.
External co-authors :
yes
Language :
English
Title :
Automatic Identifier Inconsistency Detection Using Code Dictionary
Publication date :
March 2016
Journal title :
Empirical Software Engineering
ISSN :
1573-7616
Publisher :
Springer Science & Business Media B.V.
Volume :
21
Issue :
2
Pages :
565-604
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBilu :
since 25 February 2015

Statistics


Number of views :
201 (10 by Unilu)
Number of downloads :
570 (5 by Unilu)
Scopus citations® :
31
Scopus citations® without self-citations :
26
OpenCitations :
15
WoS citations :
27
