Abstract :
[en] Log message template identification aims to convert raw logs
containing free-form log messages into structured logs to be
processed by automated log-based analysis, such as anomaly detection
and model inference. While many techniques have been proposed in the
literature, only two recent studies provide a comprehensive evaluation
and comparison of the techniques using an established benchmark
composed of real-world logs. Nevertheless, we argue that both studies
have the following issues: (1) they used different accuracy metrics
without comparison between them, (2) some ground-truth (oracle)
templates are incorrect, and (3) the accuracy evaluation
results do not provide any information regarding incorrectly
identified templates.
In this paper, we address the above issues by providing three
guidelines for assessing the accuracy of log template identification
techniques: (1) use appropriate accuracy metrics, (2) perform oracle
template correction, and (3) perform analysis of incorrect templates. We then
assess the application of these guidelines through a comprehensive
evaluation of 14 existing template identification techniques on the
established benchmark logs. The results yield insights that differ
markedly from those of existing studies and, in particular, a much
less optimistic outlook on existing techniques.
Publisher :
ACM, New York, NY, United States