Reference : A Search-based Approach for Accurate Identification of Log Message Formats
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
Security, Reliability and Trust
http://hdl.handle.net/10993/35286
English
Messaoudi, Salma [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)]
Panichella, Annibale [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)]
Bianculli, Domenico [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)]
Briand, Lionel [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)]
Sasnauskas, Raimondas [Société Européenne des Satellites - SES]
2018
Proceedings of the 26th IEEE/ACM International Conference on Program Comprehension (ICPC ’18)
ACM
Yes
No
International
26th IEEE/ACM International Conference on Program Comprehension (ICPC ’18)
from 27-05-2018 to 28-05-2018
IEEE/ACM
Gothenburg
Sweden
[en] log parsing ; log analysis ; log message format ; NSGA-II
[en] Many software engineering activities process the events contained in log files. However, before performing any processing activity, it is necessary to parse the entries in a log file, to retrieve the actual events recorded in the log. Each event is denoted by a log message, which is composed of a fixed part, called the (event) template, that is the same for all occurrences of the same event type, and a variable part, which may vary with each event occurrence. The formats of log messages, in complex and evolving systems, have numerous variations, are typically not entirely known, and change on a frequent basis; therefore, they need to be identified automatically.

The log message format identification problem deals with the identification of the different templates used in the messages of a log. Any solution to this problem has to generate templates that meet two main goals: generating templates that are not too general, so as to distinguish different events, but also not too specific, so as not to consider different occurrences of the same event as following different templates; however, these goals are conflicting.
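
To make the tension between the two goals concrete, the following is a minimal sketch and not the representation used in the paper. It assumes a hypothetical template syntax in which `*` stands for one variable token, and it uses an invented three-line log. A template made only of wildcards matches every message and so cannot tell events apart, while a template with no wildcards treats each occurrence as its own event.

```python
WILDCARD = "*"  # hypothetical marker for a variable token; an assumption of this sketch


def matches(template: str, message: str) -> bool:
    """Return True if the message fits the template token by token."""
    t_tokens = template.split()
    m_tokens = message.split()
    if len(t_tokens) != len(m_tokens):
        return False
    return all(t == WILDCARD or t == m for t, m in zip(t_tokens, m_tokens))


def matched(template: str, log: list) -> list:
    """Return the messages of the log that fit the template."""
    return [m for m in log if matches(template, m)]


# Invented log entries: two occurrences of one event type, one of another.
log = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Session for alice opened",
]

too_general = "* * * *"                            # merges both event types
too_specific = "Connection from 10.0.0.1 closed"   # splits one event type
balanced = "Connection from * closed"              # fixed part kept, variable part abstracted

for name, template in [("too general", too_general),
                       ("too specific", too_specific),
                       ("balanced", balanced)]:
    print(name, "->", len(matched(template, log)), "message(s) matched")
```

In this toy log, only the balanced template groups both occurrences of the connection event without also absorbing the session event.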

In this paper, we present the MoLFI approach, which recasts the log message identification problem as a multi-objective problem. MoLFI uses an evolutionary approach to solve this problem, by tailoring the NSGA-II algorithm to search the space of solutions for a Pareto optimal set of message templates. We have implemented MoLFI in a tool, which we have evaluated on six real-world datasets, containing log files with a number of entries ranging from 2K to 300K. The experimental results show that MoLFI extracts by far the highest number of correct log message templates, significantly outperforming two state-of-the-art approaches on all datasets.
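
The notion of a Pareto optimal set can be illustrated with a short sketch. This is not NSGA-II and not the MoLFI implementation; it only shows the non-dominance test on which Pareto optimality rests. The two objective names (`specificity`, `coverage`) and all scores are assumptions made up for the example, and both objectives are taken to be maximized.

```python
from typing import List, Tuple

# (specificity, coverage): hypothetical objective values, both to be maximized.
Score = Tuple[float, float]


def dominates(a: Score, b: Score) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] >= b[1]
    strictly_better = a[0] > b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep only the candidates that no other candidate dominates."""
    return [s for s in scores if not any(dominates(other, s) for other in scores)]


# Invented scores for five candidate sets of templates.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
front = pareto_front(candidates)
print(front)
```

Because the goals conflict, the front generally holds several trade-off solutions rather than a single best one; here (0.4, 0.4) and (0.1, 0.1) are discarded because (0.5, 0.5) is better on both objectives.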
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Software Verification and Validation Lab (SVV Lab)
Fonds National de la Recherche - FnR ; European Commission - EC
Researchers ; Professionals ; Students ; General public ; Others
H2020 ; 694277 - TUNE - Testing the Untestable: Model Testing of Complex Software-Intensive Systems
FnR ; FNR11602677 > Lionel Briand > LISTENER > Log-drIven, Search-based TEst geNERation for Ground Control Systems > 01/01/2018 > 31/12/2020 > 2017

File(s) associated to this reference

Fulltext file(s):

File: ICPC-2018.pdf
Version: Author postprint
Size: 741.78 kB
Access: Open access


All documents in ORBilu are protected by a user license.