No full text
Eprint already available on another site (E-prints, Working papers and Research blog)
CodeAD: Synthesize Code of Rules for Log-based Anomaly Detection with LLMs
Huang, Junjie; He, Minghua; Liu, Jinyang et al.
2025
 

Details



Keywords :
Computer Science - Software Engineering; Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Multiagent Systems
Abstract :
[en] Log-based anomaly detection (LogAD) is critical for maintaining the reliability and availability of large-scale online service systems. While methods based on machine learning, deep learning, and large language models (LLMs) have advanced LogAD, they often suffer from limited interpretability, high inference costs, and extensive preprocessing requirements, which limits their practicality for real-time, high-volume log analysis. In contrast, rule-based systems offer efficiency and transparency, but require significant manual effort and are difficult to scale across diverse and evolving environments. In this paper, we present CodeAD, a novel framework that automatically synthesizes lightweight Python rule functions for LogAD using LLMs. CodeAD introduces a hierarchical clustering and anchor-grounded sampling strategy to construct representative contrastive log windows, enabling LLMs to discern discriminative anomaly patterns. To ensure robustness and generalizability, CodeAD employs an agentic workflow that iteratively generates, tests, repairs, and refines the rules until they meet correctness and abstraction requirements. The synthesized rules are interpretable, lightweight, and directly executable on raw logs, supporting efficient and transparent online anomaly detection. Our comprehensive experiments on three public datasets (BGL, Hadoop, Thunderbird) demonstrate that CodeAD achieves an average absolute improvement of 3.6% in F1 score over state-of-the-art baselines, while processing large datasets up to 4x faster and at a fraction of the cost (total LLM invocation cost under 4 USD per dataset). These results highlight CodeAD as a practical and scalable solution for online monitoring systems, enabling interpretable, efficient, and automated LogAD in real-world environments.
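To make the idea of a "lightweight Python rule function" concrete, the following is a minimal sketch of what such a rule and a contrastive check might look like. It is not taken from the paper: the function names, the keyword pattern, the threshold, and the sample log lines are all hypothetical, chosen only to illustrate a rule that runs directly on raw log windows and is tested against normal and anomalous examples.

```python
import re
from typing import Callable, List

# Hypothetical pattern: these keywords are illustrative, not rules from the paper.
_FATAL = re.compile(r"\b(FATAL|kernel panic|machine check)\b", re.IGNORECASE)


def rule_fatal_burst(window: List[str], threshold: int = 2) -> bool:
    """Flag a window of raw log lines as anomalous when at least
    `threshold` lines match the fatal-event pattern."""
    hits = sum(1 for line in window if _FATAL.search(line))
    return hits >= threshold


def passes_contrastive_check(
    rule: Callable[[List[str]], bool],
    normal_windows: List[List[str]],
    anomalous_windows: List[List[str]],
) -> bool:
    """Return True if the rule fires on every anomalous window
    and on no normal window (a stand-in for a 'test' step)."""
    fires_on_anomalies = all(rule(w) for w in anomalous_windows)
    silent_on_normal = not any(rule(w) for w in normal_windows)
    return fires_on_anomalies and silent_on_normal


# Made-up sample windows, for illustration only.
normal = [["INFO job started", "INFO job finished"]]
anomalous = [["FATAL data storage interrupt", "INFO retry", "kernel panic on node 7"]]

print(passes_contrastive_check(rule_fatal_burst, normal, anomalous))
```

In a generate-test-repair loop of the kind the abstract describes, a check like `passes_contrastive_check` could serve as the acceptance test: a candidate rule that fails it would be sent back for repair rather than deployed.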
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV - Software Verification and Validation
Disciplines :
Computer science
Author, co-author :
Huang, Junjie
He, Minghua
Liu, Jinyang
Huo, Yintong
Bianculli, Domenico; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SVV
Lyu, Michael R.
Language :
English
Title :
CodeAD: Synthesize Code of Rules for Log-based Anomaly Detection with LLMs
Publication date :
October 2025
Focus Area :
Security, Reliability and Trust
FnR Project :
FNR17373407 - LOGODOR - Automated Log Smell Detection And Removal, 2022 (01/09/2023-31/08/2026) - Domenico Bianculli
Available on ORBilu :
since 05 January 2026

