Voos, Holger [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Engineering Research Unit]
26-Apr-2019
International Conference on Control, Decision and Information CoDIT, Paris 23-26 April 2019
IEEE
Yes
No
International
International Conference on Control, Decision and Information CoDIT
from 23-04-2019 to 26-04-2019
Paris
France
[en] Reinforcement Learning ; UAV ; TRPO
[en] In this paper we apply deep reinforcement learning techniques to a multicopter to learn a stable hovering task in an environment with continuous state and action spaces. We present a framework based on OpenAI Gym, Gazebo and the RotorS MAV simulator, which we use to train different agents to perform various tasks. The deep reinforcement learning method used for training is Trust Region Policy Optimization (TRPO), a model-free, on-policy, actor-critic algorithm. Two neural networks are used as nonlinear function approximators. Our experiments show that this learning approach achieves successful results and facilitates the process of controller design.
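As a rough illustration of the quantity TRPO optimizes, the sketch below computes the importance-weighted surrogate objective and the mean KL divergence that bounds each policy update. It is not the authors' implementation: it assumes a one-dimensional Gaussian policy, and all function names, the batch layout and the max_kl threshold of 0.01 are hypothetical choices made for this example. It uses only the Python standard library rather than the Gym, Gazebo or RotorS stack named in the abstract.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log density of a 1-D Gaussian policy at the given action."""
    return (-0.5 * ((action - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2 * math.pi))


def gaussian_kl(mean_old, std_old, mean_new, std_new):
    """KL(old || new) between two 1-D Gaussian distributions."""
    return (math.log(std_new / std_old)
            + (std_old ** 2 + (mean_old - mean_new) ** 2) / (2 * std_new ** 2)
            - 0.5)


def surrogate_objective(actions, advantages, old_params, new_params):
    """Batch mean of (new policy prob / old policy prob) * advantage.

    old_params and new_params are lists of (mean, std) pairs, one per
    sampled state (a hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for a, adv, (m_old, s_old), (m_new, s_new) in zip(
            actions, advantages, old_params, new_params):
        log_ratio = (gaussian_log_prob(a, m_new, s_new)
                     - gaussian_log_prob(a, m_old, s_old))
        total += math.exp(log_ratio) * adv
    return total / len(actions)


def within_trust_region(old_params, new_params, max_kl=0.01):
    """Check the trust-region constraint: mean KL must not exceed max_kl."""
    kls = [gaussian_kl(m_o, s_o, m_n, s_n)
           for (m_o, s_o), (m_n, s_n) in zip(old_params, new_params)]
    return sum(kls) / len(kls) <= max_kl
```

A TRPO step searches for new policy parameters that increase surrogate_objective while within_trust_region stays true. In an actor-critic setup the advantages would come from the critic network, which is one plausible reading of the two networks mentioned in the abstract.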
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > Automation & Robotics Research Group
Researchers ; Professionals ; Students ; General public ; Others