Keywords :
Large Language Models; Behavior Tree; Human-Robot Interaction; Robotics
Abstract :
[en] As intelligent robots become more integrated into human environments, there is a growing need for intuitive, reliable Human-Robot Interaction (HRI) interfaces that are adaptable and natural to use. Traditional robot control methods often require users to adapt to interfaces or memorize predefined commands, limiting usability in dynamic, unstructured environments. This paper presents a novel framework that bridges natural language understanding and robotic execution by combining Large Language Models (LLMs) with Behavior Trees. This integration enables robots to interpret users' natural-language instructions and translate them into executable actions by activating domain-specific plugins. The system supports scalable and modular integration, with a primary focus on perception-based functionalities such as person tracking and hand gesture recognition. To evaluate the system, a series of real-world experiments was conducted across diverse environments. Experimental results show that the proposed approach is practical in real-world scenarios, achieving an average cognition-to-execution accuracy of approximately 94%.
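To make the described pipeline concrete, the following minimal Python sketch illustrates the cognition-to-execution flow: a natural-language instruction is mapped to domain-specific plugins (here, the paper's two perception examples) and executed as leaves of a small behavior tree. All names (query_llm, PLUGINS, track_person, recognize_gesture) are hypothetical stand-ins, and the keyword lookup inside query_llm merely simulates an LLM call; this is not the paper's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable

class Status(Enum):
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    RUNNING = "RUNNING"

@dataclass
class Action:
    """Behavior-tree leaf: wraps a single plugin callback."""
    name: str
    callback: Callable[[], Status]

    def tick(self) -> Status:
        return self.callback()

@dataclass
class Sequence:
    """Behavior-tree composite: succeeds only if every child succeeds, in order."""
    children: list

    def tick(self) -> Status:
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status  # propagate FAILURE/RUNNING immediately
        return Status.SUCCESS

# Hypothetical perception plugins (the paper's examples are person tracking
# and hand gesture recognition); a real plugin would invoke a ROS node.
def track_person() -> Status:
    print("[plugin] person tracking activated")
    return Status.SUCCESS

def recognize_gesture() -> Status:
    print("[plugin] hand gesture recognition activated")
    return Status.SUCCESS

PLUGINS: dict[str, Callable[[], Status]] = {
    "track_person": track_person,
    "recognize_gesture": recognize_gesture,
}

def query_llm(instruction: str) -> list[str]:
    """Stand-in for the LLM call: maps free-form text to plugin names.
    A real system would prompt the model with the plugin catalogue and
    parse a structured (e.g. JSON) response instead of keyword matching."""
    keywords = {"follow": "track_person", "gesture": "recognize_gesture"}
    return [name for word, name in keywords.items() if word in instruction.lower()]

def build_tree(instruction: str) -> Sequence:
    """Translate an instruction into an executable behavior (sub)tree."""
    return Sequence([Action(n, PLUGINS[n]) for n in query_llm(instruction)])

if __name__ == "__main__":
    tree = build_tree("Please follow me and react to my hand gesture")
    print("tree tick:", tree.tick())
```

In a full system, query_llm would prompt the model with the catalogue of available plugins and parse a structured response, and each plugin would wrap a ROS action rather than a print statement.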
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > ARG - Automation & Robotics
Disciplines :
Computer science
Author, co-author :
CHEKAM, Ingrid Maéva ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation
PASTOR-MARTINEZ, Ines ; Interdisciplinary Centre for Security, Reliability and Trust (SnT), Automation and Robotics Research Group (ARG), University of Luxembourg, Luxembourg
TOURANI, Ali ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation
MILLÁN ROMERA, José Andrés ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation
RIBEIRO, Laura ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation
BASTOS SOARES, Pedro Miguel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation > Team Jose Luis SANCHEZ LOPEZ
FnR Project :
FNR17387634 - DEUS - Deep Understanding Of The Situation For Robots, 2022 (01/09/2023-31/08/2026) - Jose-Luis Sanchez-Lopez
FNR17097684 - RoboSAUR - Robotic Situational Awareness By Understanding And Reasoning, 2022 (15/09/2022-14/09/2026) - José Andrés Millán Romera
Funders :
Fonds National de la Recherche Luxembourg
Institute of Advanced Studies (IAS), University of Luxembourg
Funding text :
This research was funded, in part, by the Luxembourg National Research Fund (FNR), DEUS Project (Ref. C22/IS/17387634/DEUS), RoboSAUR Project (Ref. 17097684/RoboSAUR), and MR-Cobot Project (Ref. 18883697/MR-Cobot). It was also partially funded by the Institute of Advanced Studies (IAS) of the University of Luxembourg through an “Audacity” grant (project TRANSCEND - 2021). For the purpose of open access, and in fulfillment of the obligations arising from the grant agreement, the author has applied a Creative Commons Attribution 4.0 International (CC BY 4.0) license to any Author Accepted Manuscript version arising from this submission.