Aerial robots; Control challenges; Deterministic; Hazardous environments; Large-scale; Material deposition; Policy gradient; Re-tuning; Real-time control; Reinforcement learning; Modeling and Simulation; Aerospace Engineering; Control and Optimization
Abstract :
[en] This paper investigates the application of Deep Reinforcement Learning (DRL) to address motion control challenges in drones for additive manufacturing (AM). Drone-based additive manufacturing offers a flexible and autonomous solution for material deposition in large-scale or hazardous environments. However, achieving robust real-time control of a multi-rotor aerial robot under varying payloads and potential disturbances remains challenging. Traditional controllers such as PID often require frequent parameter re-tuning, limiting their applicability in dynamic scenarios. We propose a DRL framework that learns adaptable control policies for multi-rotor drones performing waypoint navigation in AM tasks. We compare Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) within a curriculum learning scheme designed to handle increasing complexity. Our experiments show that TD3 consistently balances training stability, accuracy, and task success, particularly when mass variability is introduced. These findings provide a scalable path toward robust, autonomous drone control in additive manufacturing.
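The DDPG-versus-TD3 comparison in the abstract rests largely on TD3's clipped double-Q target, which counters the value overestimation that destabilizes DDPG training. A minimal illustrative sketch of the two bootstrap targets (not the paper's implementation; function names and values are hypothetical):

```python
def ddpg_target(reward, q_next, gamma=0.99, done=False):
    """DDPG bootstrap target: trusts a single critic's next-state estimate."""
    return reward + gamma * (1.0 - float(done)) * q_next

def td3_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """TD3 bootstrap target: takes the minimum of two critics' estimates,
    damping the overestimation bias a single critic accumulates."""
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)

# Toy illustration: when one critic overestimates (q1 inflated),
# TD3's target is clipped by the more pessimistic critic.
r, q1, q2 = 1.0, 5.0, 3.0
print(ddpg_target(r, q1))      # follows the inflated estimate
print(td3_target(r, q1, q2))   # bounded by min(q1, q2)
```

TD3 additionally delays actor updates and smooths target actions with clipped noise, which together account for the training stability the experiments report.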
Disciplines :
Aerospace & aeronautics engineering
Author, co-author :
SHETTY, Gaurav ; University of Luxembourg ; Bonn-Rhein-Sieg University of Applied Sciences, Germany
HABIBI, Hamed ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust > Automation > Team Holger VOOS
VOOS, Holger ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Automation
SANCHEZ-LOPEZ, Jose Luis ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation and Robotics Research Group ; Faculty of Science, Technology and Medicine, Luxembourg
External co-authors :
yes
Language :
English
Title :
Motion Control in Multi-Rotor Aerial Robots Using Deep Reinforcement Learning
Publication date :
May 2025
Event name :
2025 International Conference on Unmanned Aircraft Systems (ICUAS)
Event place :
Charlotte, USA
Event date :
14-05-2025 to 17-05-2025
Main work title :
2025 International Conference on Unmanned Aircraft Systems, ICUAS 2025
Publisher :
Institute of Electrical and Electronics Engineers Inc.