A software log is a sequence of log messages generated by log printing statements in the source code. Logs are essential for various software engineering tasks, such as model inference and anomaly detection, since they are often the only available data that records the run-time behavior of a software system. However, they cannot be directly processed by log-based analysis techniques, which require structured input rather than free-form log messages. Log parsing addresses this issue by decomposing each log message into a fixed part, called the message template, which characterizes the event type, and variable parts, which contain the parameter values of the event as determined at run time.
Although many log parsing techniques have been proposed, they have not been systematically compared and ranked against different criteria. Additionally, log parsing is widely used in log-based anomaly detection and might affect anomaly detection accuracy; yet the relationship between log parsing and anomaly detection has not been thoroughly investigated. With the emergence of non-log-parsing-based anomaly detection techniques, which rule out the impact of log parsing, a comprehensive evaluation is required to assess which approach is more suitable for anomaly detection.
In this thesis, we make the following contributions:
1. We assessed and compared different log parsing techniques and provided guidelines for evaluating the accuracy of log parsing techniques considering different use cases.
2. We proposed a theoretical framework for understanding the relationship between log parsing and anomaly detection, formally defining the concepts of distinguishability and minimality of ideal log parsing results.
3. We performed a comprehensive empirical study investigating the impact of log parsing on anomaly detection accuracy.
4. We performed a comprehensive empirical study comparing the accuracy and efficiency of log-parsing-based and non-log-parsing-based anomaly detection techniques.
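As a rough illustration of the decomposition that log parsing performs (a fixed message template plus run-time parameter values), the following minimal sketch masks tokens that look like parameters. It is not one of the techniques evaluated in the thesis: the function name `parse_message`, the `<*>` placeholder, the token patterns, and the sample log message are all assumptions made for this example.

```python
import re

# Hypothetical patterns for tokens treated as run-time parameters.
# Real log parsers use far more elaborate heuristics or learning.
VARIABLE_PATTERNS = [
    re.compile(r"^\d+(\.\d+){3}(:\d+)?$"),  # IPv4 address, optional port
    re.compile(r"^0x[0-9a-fA-F]+$"),        # hexadecimal value
    re.compile(r"^-?\d+(\.\d+)?$"),         # integer or decimal number
]


def parse_message(message):
    """Split a log message into (template, parameters).

    Tokens matching any pattern are replaced by "<*>" in the template
    and collected, in order, as parameter values.
    """
    template_tokens = []
    parameters = []
    for token in message.split():
        if any(p.match(token) for p in VARIABLE_PATTERNS):
            template_tokens.append("<*>")
            parameters.append(token)
        else:
            template_tokens.append(token)
    return " ".join(template_tokens), parameters


# Made-up log message, used only for illustration.
template, params = parse_message(
    "Received block 4821 of size 67108864 from 10.250.19.102"
)
print(template)  # Received block <*> of size <*> from <*>
print(params)    # ['4821', '67108864', '10.250.19.102']
```

Messages that differ only in their parameter values map to the same template, which is what lets downstream analyses treat them as occurrences of the same event type.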
Disciplines :
Computer science
Author, co-author :
KHAN, Zanis Ali ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
Language :
English
Title :
On log parsing and log-based anomaly detection: an empirical evaluation
Defense date :
13 November 2023
Institution :
Unilu - University of Luxembourg [The Faculty of Sciences, Technology and Medicine], Luxembourg, Luxembourg
Degree :
Docteur en Informatique (DIP_DOC_0006_B)
Promotor :
BIANCULLI, Domenico ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV
President :
PASTORE, Fabrizio ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SVV ; Unilu - University of Luxembourg [LU]