In this paper we apply deep reinforcement learning techniques to a multicopter to learn a stable hovering task in an environment with continuous state and action spaces. We present a framework based on OpenAI Gym, Gazebo and the RotorS MAV simulator, which we used to train different agents to perform various tasks. The deep reinforcement learning method used for training is Trust Region Policy Optimization (TRPO), a model-free, on-policy, actor-critic algorithm. Two neural networks are used as nonlinear function approximators. Our experiments showed that this learning approach achieves successful results and facilitates the process of controller design.
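As a rough illustration of the Gym-style interaction loop that such a framework exposes, the sketch below defines a toy one-dimensional hover task with `reset` and `step` methods. It is not the authors' environment, which wraps Gazebo and RotorS. The class `HoverEnv1D`, the point-mass dynamics, the reward shape, and every numeric parameter are assumptions made for illustration. A hand-tuned PD controller stands in for the learned TRPO policy.

```python
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2


@dataclass
class HoverEnv1D:
    """Toy 1-D altitude-hold task with a Gym-style reset/step interface.

    Hypothetical illustration only: a point mass, not a Gazebo/RotorS model.
    """
    target_z: float = 1.0      # desired hover altitude [m] (assumed)
    mass: float = 1.0          # vehicle mass [kg] (assumed)
    dt: float = 0.02           # integration step [s] (assumed)
    max_thrust: float = 20.0   # thrust saturation [N] (assumed)
    max_steps: int = 500       # episode length in steps (assumed)

    def reset(self):
        # Start at rest on the ground.
        self.z, self.vz, self.t = 0.0, 0.0, 0
        return (self.z, self.vz)

    def step(self, thrust):
        # Continuous action: a real-valued thrust, clipped to actuator limits.
        thrust = min(max(float(thrust), 0.0), self.max_thrust)
        accel = thrust / self.mass - GRAVITY
        # Semi-implicit Euler integration of the vertical dynamics.
        self.vz += accel * self.dt
        self.z += self.vz * self.dt
        if self.z < 0.0:
            # Ground contact: stop at the floor.
            self.z, self.vz = 0.0, 0.0
        self.t += 1
        # Assumed reward shape: penalise altitude error and vertical speed.
        reward = -abs(self.z - self.target_z) - 0.1 * abs(self.vz)
        done = self.t >= self.max_steps
        return (self.z, self.vz), reward, done, {}


def pd_policy(obs, env):
    """Hand-tuned PD controller used here in place of a learned policy."""
    z, vz = obs
    hover_thrust = env.mass * GRAVITY  # feed-forward term cancelling gravity
    return hover_thrust + 8.0 * (env.target_z - z) - 4.0 * vz


env = HoverEnv1D()
obs = env.reset()
done = False
episode_return = 0.0
while not done:
    obs, reward, done, _ = env.step(pd_policy(obs, env))
    episode_return += reward
```

In a learning setup, `pd_policy` would be replaced by the actor network, and the collected `(obs, action, reward)` tuples would feed the policy update. The observation here is only altitude and vertical speed, whereas a full multicopter state would also include attitude and angular rates.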
Research center :
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation & Robotics Research Group
Disciplines :
Computer science
Author, co-author :
MANUKYAN, Anush ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)
OLIVARES MENDEZ, Miguel Angel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT)
Geist, Matthieu
VOOS, Holger ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Engineering Research Unit
External co-authors :
yes
Language :
English
Title :
Deep Reinforcement Learning based Continuous Control for Multicopter Systems
Publication date :
26 April 2019
Event name :
International Conference on Control, Decision and Information Technologies (CoDIT)
Event place :
Paris, France
Event date :
from 23-04-2019 to 26-04-2019
Audience :
International
Main work title :
International Conference on Control, Decision and Information Technologies (CoDIT), Paris, 23-26 April 2019