ChaosLLM: A Dependability Testing Approach for Tool-Calling Agents

IANNILLO, Antonio Ken

doi:10.1109/issrew67781.2025.00085

Article (Scientific journals)

2025 • In 2025 IEEE 36th International Symposium on Software Reliability Engineering Workshops (ISSREW), p. 282-285

Peer reviewed

Permalink
https://hdl.handle.net/10993/67676

Files

Full Text

ISSRE2025_Iannillo.pdf

Author postprint (191.24 kB)

Details

Precision for document type :

Review article

Disciplines :

Computer science

Author, co-author :

IANNILLO, Antonio Ken ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN

Language :

English

Title :

ChaosLLM: A Dependability Testing Approach for Tool-Calling Agents

Publication date :

21 October 2025

Journal title :

2025 IEEE 36th International Symposium on Software Reliability Engineering Workshops (ISSREW)

Publisher :

IEEE

Pages :

282-285

Peer reviewed :

Peer reviewed

Additional URL :

http://xplorestaging.ieee.org/ielx8/11262267/11262260/11262344.pdf?arnumber=11262344

Available on ORBilu :

since 04 February 2026
