Keywords :
improved DDPG; LSTM network modeling; model predictive control; path planning; reinforcement learning; collision avoidance; complex environments; long short-term memory networks; Control and Optimization; Aerospace Engineering
Abstract :
[en] In this paper, we tackle the problem of Unmanned Aerial Vehicle (UAV) path planning in complex and uncertain environments by designing a Model Predictive Control (MPC) scheme, based on a Long Short-Term Memory (LSTM) network, integrated into the Deep Deterministic Policy Gradient (DDPG) algorithm. In the proposed solution, the LSTM-MPC operates as the deterministic policy within the DDPG network and leverages a predicting pool that stores predicted future states and actions for improved robustness and efficiency. The predicting pool also enables the initialization of the critic network, leading to faster convergence and a lower failure rate than traditional reinforcement learning and deep reinforcement learning methods. The effectiveness of the proposed solution is evaluated through numerical simulations.
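The predicting-pool idea described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration only: the `PredictingPool` class, the toy proportional policy standing in for the LSTM-MPC, and the toy dynamics are all assumptions, not the paper's actual implementation.

```python
from collections import deque
import random

class PredictingPool:
    """Hypothetical sketch of a 'predicting pool': a buffer of
    model-predicted (state, action, next_state) tuples produced by a
    model-predictive policy, usable to warm-start a critic network."""
    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, predicted_next_state):
        self.buffer.append((state, action, predicted_next_state))

    def sample(self, batch_size):
        # Sample a minibatch for (e.g.) critic pre-training.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def mpc_like_policy(state, horizon=3):
    """Stand-in for the LSTM-MPC deterministic policy: roll out a toy
    dynamics model (next = state + action) over a short horizon, steering
    the scalar state toward 0, and return the first action plus the
    predicted trajectory."""
    trajectory = []
    s = state
    for _ in range(horizon):
        a = -0.5 * s          # proportional action toward the goal
        s = s + a             # toy one-step dynamics
        trajectory.append((a, s))
    first_action = trajectory[0][0]
    return first_action, trajectory

# Fill the pool with predicted transitions before any real interaction,
# mirroring the abstract's use of predictions to initialize the critic.
pool = PredictingPool()
state = 4.0
for _ in range(10):
    action, traj = mpc_like_policy(state)
    pool.push(state, action, traj[0][1])
    state = traj[0][1]

batch = pool.sample(4)
print(len(pool.buffer), len(batch))  # -> 10 4
```

In an actual DDPG training loop, such a pool would be sampled alongside (or before) the ordinary replay buffer so that the critic sees model-predicted transitions early, which is consistent with the convergence-speed claim in the abstract.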
Funding text :
ACKNOWLEDGMENTS This research was partially supported by the European Union's Horizon 2020 project Secure and Safe Multi-Robot Systems (SESAME) under grant agreement no. 101017258. For the purpose of open access, the author has applied a Creative Commons Attribution 4.0 International (CC BY 4.0) license to any Author Accepted Manuscript version arising from this submission.
Scopus citations® (without self-citations): 1