References of "Han, Junfeng"
Delay Constrained Resource Allocation for NOMA Enabled Satellite Internet of Things with Deep Reinforcement Learning
Yan, Xiaojuan; An, Kang; Zhang, Qianfeng et al

in IEEE Internet of Things Journal (2020)

With the ever-increasing requirement of transferring data from/to smart users within a wide area, satellite Internet of Things (S-IoT) networks have emerged as a promising paradigm to provide a cost-effective solution for remote and disaster areas. Taking into account the diverse link qualities and delay quality-of-service (QoS) requirements of S-IoT devices, we introduce a power-domain non-orthogonal multiple access (NOMA) scheme in the downlink S-IoT network to enhance resource utilization efficiency and employ the concept of effective capacity to capture the delay-QoS requirements of S-IoT traffic. First, resource allocation among NOMA users is formulated with the aim of maximizing the sum effective capacity of the S-IoT network while meeting the minimum capacity constraint of each user. Because the initial optimization problem is intractable and non-convex, especially with large-scale user pairing in NOMA-enabled S-IoT, this paper employs a deep reinforcement learning (DRL) algorithm for dynamic resource allocation. Specifically, the channel conditions and/or delay-QoS requirements of the NOMA users are carefully selected as the state according to exact closed-form expressions as well as low-SNR and high-SNR approximations; a deep Q network is first adopted to obtain the reward and output the optimum power allocation coefficients for all users, and then learns to adjust the allocation policy by updating the weights of the neural networks using the gained experience. Simulation results demonstrate that, with a proper discount factor, reward design, and training mechanism, the proposed DRL-based power allocation scheme can output the optimal or near-optimal action in each time slot and thus provides superior performance to a fixed power allocation strategy and an orthogonal multiple access (OMA) scheme.
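
As context for the abstract above (not part of the published record), the sketch below illustrates the general technique the paper describes: a deep Q network that maps the channel gains of a two-user NOMA pair to a discretized power allocation coefficient, rewarded with a per-slot surrogate of the sum effective capacity E_C(θ) = -(1/(θT)) ln E[e^(-θTR)] for QoS exponent θ, frame length T, and service rate R. This is not the authors' implementation; the channel model, power-coefficient grid, QoS exponents, network architecture, and training schedule are all assumptions chosen only to make the example self-contained.

```python
# Minimal, illustrative sketch (not the paper's code) of DQN-based power
# allocation for a two-user downlink NOMA pair. All numerical values are
# assumptions made purely for demonstration.
import random
import numpy as np
import torch
import torch.nn as nn

N_ACTIONS = 10                                   # discretized power coefficients
ACTIONS = np.linspace(0.55, 0.95, N_ACTIONS)     # alpha > 0.5 keeps the NOMA power ordering
THETA = (0.01, 0.1)                              # assumed delay-QoS exponents of the two users

def slot_reward(alpha, h1, h2, snr=20.0):
    # Instantaneous NOMA rates: the weak user (gain h1) gets power fraction alpha,
    # the strong user (gain h2) applies SIC and gets the remaining 1 - alpha.
    r1 = np.log2(1 + alpha * snr * h1 / ((1 - alpha) * snr * h1 + 1))
    r2 = np.log2(1 + (1 - alpha) * snr * h2)
    # Per-slot surrogate of sum effective capacity: maximizing
    # EC_k = -(1/(theta_k*T)) ln E[exp(-theta_k*T*R_k)] is equivalent to
    # minimizing E[exp(-theta_k*T*R_k)], so we reward its negative (T = 1 here).
    return -sum(np.exp(-t * r) for t, r in zip(THETA, (r1, r2)))

class QNet(nn.Module):
    # Small MLP mapping the state (two channel gains) to Q-values over actions.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
    def forward(self, x):
        return self.net(x)

qnet = QNet()
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
eps = 1.0
buffer = []

for episode in range(2000):
    # State: current channel gains of the paired users (Rayleigh fading assumed).
    h1, h2 = sorted(np.random.exponential(1.0, 2))
    s = torch.tensor([h1, h2], dtype=torch.float32)

    # Epsilon-greedy selection over the discretized power allocation coefficients.
    if random.random() < eps:
        a = random.randrange(N_ACTIONS)
    else:
        a = int(qnet(s).argmax())
    r = slot_reward(ACTIONS[a], h1, h2)
    buffer.append((s, a, r))
    eps = max(0.05, eps * 0.995)

    # One-step update on a random mini-batch. Each slot is treated as terminal
    # (bandit-style), so the regression target is just the observed reward.
    batch = random.sample(buffer, min(32, len(buffer)))
    states = torch.stack([b[0] for b in batch])
    actions = torch.tensor([b[1] for b in batch])
    rewards = torch.tensor([b[2] for b in batch], dtype=torch.float32)
    q = qnet(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q, rewards)
    opt.zero_grad(); loss.backward(); opt.step()
```

The sketch treats each time slot as a one-step decision; the paper's scheme additionally relies on a discount factor, reward design, and training mechanism over successive slots, which the simple replay buffer above only gestures at.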
