LEO satellites; resource allocation; reinforcement learning
Abstract :
Low Earth orbit (LEO) satellite-assisted communications are considered one of the key elements of beyond-5G (B5G) systems for providing wide coverage and cost-efficient data services. Such dynamic space-terrestrial topologies impose an exponential increase in the degrees of freedom in network management. In this paper, we address two practical issues for an over-loaded LEO-terrestrial system. The first challenge is how to efficiently schedule resources to serve a massive number of connected users, so that more data can be delivered and more users can be served. The second challenge is how to make the algorithmic solution more resilient in adapting to dynamic wireless environments. We first propose an iterative suboptimal algorithm to provide an offline benchmark. To adapt to unforeseen variations, we then propose an enhanced meta-critic learning (EMCL) algorithm, which uses a hybrid neural network for parameterization and the Wolpertinger policy for action mapping. The results demonstrate EMCL's effectiveness and fast-response capabilities in over-loaded systems and in adapting to dynamic environments, compared with previous actor-critic and meta-learning methods.
Disciplines :
Electrical & electronics engineering
Author, co-author :
Yuan, Yaxiong
Lei, Lei; Xi'an Jiaotong University
Vu, Thang Xuan ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom
Chang, Zheng; University of Jyväskylä
Chatzinotas, Symeon ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom
Sun, Sumei
External co-authors :
yes
Language :
English
Title :
Adapting to Dynamic LEO-B5G Systems: Meta-Critic Learning Based Efficient Resource Scheduling
Publication date :
November 2022
Journal title :
IEEE Transactions on Wireless Communications
ISSN :
1558-2248
Publisher :
Institute of Electrical and Electronics Engineers, New York, United States
Volume :
21
Issue :
11
Pages :
9582-9595
Peer reviewed :
Peer Reviewed verified by ORBi
European Projects :
H2020 - 742648 - AGNOSTIC - Actively Enhanced Cognition based Framework for Design of Complex Systems
FnR Project :
FNR13696663 - Resource Optimization For Next Generation Of Flexible Satellite Payloads, 2019 (01/03/2020-31/08/2023) - Eva Lagunas