Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques

KHAN, Zanis Ali; SHIN, Donghwan; BIANCULLI, Domenico; BRIAND, Lionel

doi:10.1145/3510003.3510101

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2022 • In Proceedings of the 44th International Conference on Software Engineering (ICSE ’22)

Peer reviewed

Permalink
https://hdl.handle.net/10993/50072

Files

Full text: icse2022.pdf (author postprint, 625.21 kB)

Details

Keywords :

logs; template identification; metrics

Abstract :

Log message template identification aims to convert raw logs containing free-formed log messages into structured logs to be processed by automated log-based analysis, such as anomaly detection and model inference. While many techniques have been proposed in the literature, only two recent studies provide a comprehensive evaluation and comparison of the techniques using an established benchmark composed of real-world logs. Nevertheless, we argue that both studies have the following issues: (1) they used different accuracy metrics without comparison between them, (2) some ground-truth (oracle) templates are incorrect, and (3) the accuracy evaluation results do not provide any information regarding incorrectly identified templates. In this paper, we address the above issues by providing three guidelines for assessing the accuracy of log template identification techniques: (1) use appropriate accuracy metrics, (2) perform oracle template correction, and (3) perform analysis of incorrect templates. We then assess the application of such guidelines through a comprehensive evaluation of 14 existing template identification techniques on the established benchmark logs. Results show very different insights than existing studies and in particular a much less optimistic outlook on existing techniques.
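As a rough illustration of what the abstract describes, the Python sketch below turns raw messages into structured records by matching them against templates, then computes one simple accuracy figure against oracle templates. Everything in it is an assumption made for illustration: the `<*>` wildcard notation, the function names `structure` and `message_level_accuracy`, the sample messages, and the metric itself, which is not claimed to be any of the metrics compared in the paper.

```python
import re

# Assumed placeholder for the variable part of a message (illustrative only).
WILDCARD = "<*>"


def template_to_regex(template):
    """Build a regex in which every wildcard captures one variable part."""
    literal_parts = [re.escape(part) for part in template.split(WILDCARD)]
    return re.compile("(.+?)".join(literal_parts))


def structure(message, templates):
    """Return (template, parameters) for the first template matching the message."""
    for template in templates:
        match = template_to_regex(template).fullmatch(message)
        if match:
            return template, list(match.groups())
    return None, []  # no known template fits this message


def message_level_accuracy(identified, oracle):
    """Fraction of messages whose identified template equals the oracle template.

    A hypothetical, deliberately simple metric used only for this sketch.
    """
    if len(identified) != len(oracle):
        raise ValueError("both lists must cover the same messages")
    if not oracle:
        return 0.0
    hits = sum(1 for found, expected in zip(identified, oracle) if found == expected)
    return hits / len(oracle)


if __name__ == "__main__":
    templates = ["Connection from <*> closed", "User <*> logged in from <*>"]
    print(structure("User alice logged in from 10.0.0.7", templates))
    print(message_level_accuracy(["A", "B", "C", "A"], ["A", "B", "D", "A"]))
```

A single score like this says nothing about which templates were wrong or why, which is the kind of gap the third guideline in the abstract targets.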

Research center :

Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Software Verification and Validation Lab (SVV Lab)

Disciplines :

Computer science

Author, co-author :

KHAN, Zanis Ali; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV

SHIN, Donghwan; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV

BIANCULLI, Domenico; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV

BRIAND, Lionel; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV

Language :

English

Publication date :

July 2022

Event name :

44th International Conference on Software Engineering (ICSE ’22)

Event date :

from 21-05-2022 to 29-05-2022

Audience :

International

Main work title :

Proceedings of the 44th International Conference on Software Engineering (ICSE ’22)

Publisher :

ACM, New York, NY, United States

Pages :

1095-1106

Focus Area :

Security, Reliability and Trust

Commentary :

The paper presentation can be found here: https://youtu.be/R_bEdohzn6M

Available on ORBilu :

since 27 January 2022
