Reference : Actor-Critic Deep Reinforcement Learning for Energy Minimization in UAV-Aided Networks
Scientific congresses, symposiums and conference proceedings : Paper published in a journal
Engineering, computing & technology : Electrical & electronics engineering
Security, Reliability and Trust
http://hdl.handle.net/10993/44577
English
Yuan, Yaxiong [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom]
Lei, Lei [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom]
Vu, Thang Xuan [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom]
Chatzinotas, Symeon [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom]
Ottersten, Björn [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
21-Sep-2020
2020 European Conference on Networks and Communications (EuCNC)
Yes
International
2475-6490
2575-4912
2020 European Conference on Networks and Communications (EuCNC)
from 15-06-2020 to 18-06-2020
[en] UAV-aided networks ; Deep reinforcement learning ; Actor-Critic ; User scheduling ; Energy minimization
[en] In this paper, we investigate a user-timeslot scheduling problem for downlink unmanned aerial vehicle (UAV)-aided networks, where the UAV serves as an aerial base station. We formulate an optimization problem that jointly determines user scheduling and hovering time to minimize the UAV's transmission and hovering energy. An offline algorithm based on the branch and bound method and the golden section search is proposed to solve the problem. However, the computational time of the offline algorithm grows exponentially. Therefore, we apply a deep reinforcement learning (DRL) method to design an online algorithm with less computational time. To this end, we first reformulate the original user scheduling problem as a Markov decision process (MDP). Then, an actor-critic-based RL algorithm is developed to determine the scheduling policy under the guidance of two deep neural networks. Numerical results show that the proposed online algorithm obtains a good tradeoff between performance gain and computational time.
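As a rough illustration of the golden section search named in the abstract, the sketch below minimizes a one-dimensional unimodal function over an interval. The energy model `toy_energy` is an assumption made up for this example (hovering power times hovering time plus a Shannon-type transmission term); it is not the model from the paper, and all names and parameter values are hypothetical.

```python
import math


def golden_section_search(f, lo, hi, tol=1e-6):
    """Return an approximate minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    a, b = lo, hi
    # Two interior probe points that split [a, b] in the golden ratio.
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def toy_energy(t, hover_power=1.0, tx_scale=1.0, demand=5.0):
    """Hypothetical energy for hovering time t (NOT the paper's model).

    Hovering energy grows linearly in t, while the transmit power needed
    to deliver `demand` units of data within t shrinks as t grows.
    """
    tx_power = tx_scale * (2.0 ** (demand / t) - 1.0)
    return hover_power * t + tx_power * t


# Search for the hovering time that minimizes the toy energy on [0.5, 20].
t_star = golden_section_search(toy_energy, 0.5, 20.0)
print(f"hovering time minimizing the toy energy: {t_star:.4f}")
```

With the default parameters the toy energy simplifies to t * 2^(demand / t), whose minimizer is demand * ln 2, so the result can be checked against a closed form. Each iteration reuses one of the two previous function evaluations, which is the main appeal of the method when each evaluation is expensive.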
Researchers ; Professionals ; Students
H2020 ; 742648 - AGNOSTIC - Actively Enhanced Cognition based Framework for Design of Complex Systems
FnR ; FNR11632107 > Lei Lei > ROSETTA > Resource Optimization for Integrated Satellite-5G Networks with Non-Orthogonal Multiple Access > 01/09/2018 > 31/08/2021 > 2017

File(s) associated to this reference

Fulltext file(s):

File | Commentary | Version | Size | Access
EUCNC paper.pdf | | Publisher postprint | 484.95 kB | Limited access (request a copy)


All documents in ORBilu are protected by a user license.