References of "Yuan, Yaxiong 50034266"
Machine Learning-Based Efficient Resource Scheduling for Future Wireless Communication Networks
Yuan, Yaxiong UL

Doctoral thesis (2022)

The next-generation mobile communication system, e.g., the 6G communication system, is envisioned to support unprecedented performance requirements such as exponentially increasing data requests, heterogeneous service demands, and massive connectivity. When these challenging tasks meet the scarcity of wireless resources, efficient resource management becomes crucial. Conventionally, optimization algorithms, either optimal or suboptimal, are the main approaches for solving resource allocation problems. However, the efficiency of these iterative optimization algorithms can degrade significantly when the problems become large or difficult, e.g., non-convex or combinatorial optimization problems. Over the past few years, machine learning (ML), as an emerging approach in the toolbox, has been widely investigated to accelerate the decision-making process. Since applying ML-based approaches to complex resource management problems is still at an early stage, many open issues and challenges need to be addressed before maturity and practical application. The motivation and objective of this dissertation lie in investigating and providing answers to the following research questions: 1) How to overcome the shortcomings of extensively adopted end-to-end learning in addressing resource management problems, and which types of features are suited to be learned if supervised learning is applied? 2) What are the limitations and benefits when widely used deep reinforcement learning (DRL) approaches are applied to constrained and combinatorial optimization problems in wireless networks, and are there tailored solutions to overcome the inherent drawbacks? 3) How to enable ML-based approaches to adapt in a timely manner to dynamic and complex wireless environments? 4) How to enlarge the performance gains when the paradigm shifts from centralized learning to distributed learning? The main contributions are organized into the following four research works.

Firstly, from a supervised-learning perspective, we address common issues, e.g., unsatisfactory prediction performance and resultant infeasible solutions, that arise when end-to-end learning approaches are applied to resource scheduling problems. Based on the analysis of optimal results, we design suited-to-learn features for a class of resource scheduling problems and develop combined learning-and-optimization approaches to enable time-efficient and energy-efficient resource scheduling in multi-antenna systems. The original optimization problems are mixed-integer programming problems with high-dimensional decision vectors, and obtaining the optimal solution requires exponential complexity due to the inherent difficulty of the problems. Towards an efficient and competitive solution, we apply a fully-connected deep neural network (DNN) and a convolutional neural network (CNN) to learn the designed features. The predicted information effectively reduces the large search space and accelerates the optimization process. Compared to conventional optimization and pure ML algorithms, the proposed method achieves a good trade-off between solution quality and complexity.

Secondly, we address typical issues that arise when DRL is adopted to deal with combinatorial and non-convex scheduling problems. The original problem is to provide energy-saving solutions via resource scheduling in energy-constrained networks. An optimal algorithm and a golden-section-search-based suboptimal approach are developed to serve as offline benchmarks. For online operation, we propose an actor-critic-based deep stochastic online scheduling (AC-DSOS) algorithm. Compared to supervised learning, DRL is suitable for dynamic environments and capable of making decisions based on the current state without an offline training phase. However, for this specific constrained scheduling problem, conventional DRL may not be able to handle two major issues: an exponentially increasing action space and infeasible actions. The proposed AC-DSOS is developed to overcome these drawbacks. In simulations, AC-DSOS is able to provide feasible solutions and save more energy than conventional DRL algorithms. Compared to the offline benchmarks, AC-DSOS reduces the computational time from second-level to millisecond-level.

Thirdly, the dissertation pays attention to the performance of ML-based approaches in highly dynamic and complex environments. Most ML models are trained on collected data or observed environments and may not be able to respond in a timely manner to large variations of the environment, such as dramatically fluctuating channel states or bursty data demands. In this work, we develop ML-based approaches for a time-varying satellite-terrestrial network and address two practical issues. The first is how to efficiently schedule resources to serve a massive number of connected users, such that more data can be delivered and more users can be served. The second is how to make the algorithmic solution more resilient in adapting to time-varying wireless environments. We propose an enhanced meta-critic learning (EMCL) algorithm, combining a DRL model with a meta-learning technique, where meta-learning acquires meta-knowledge from different tasks and adapts quickly to new tasks. The results demonstrate EMCL's effectiveness and fast-response capabilities in overloaded systems and in adapting to dynamic environments, compared to previous actor-critic and meta-learning methods.

Fourthly, the dissertation focuses on reducing the energy consumption of federated learning (FL) in mobile edge computing. The power supply and computation capabilities of edge devices are typically limited; thus, energy becomes a critical issue in FL. We propose a joint sparsification and resource optimization scheme (JSRO) to jointly reduce computational and transmission energy. In the first part of JSRO, we introduce sparsity and adopt sparse or binary neural networks (SNN or BNN) as the learning model to complete the local training tasks at the devices. Compared to a fully-connected DNN, the number of computational operations is significantly reduced, which lowers energy consumption and reduces the amount of data transmitted to the central node. In the second part, we develop an efficient scheduling scheme to minimize the overall transmission energy by optimizing wireless resources and learning parameters. We develop an enhanced FL algorithm in JSRO, i.e., a stochastic gradient descent variant designed for non-smoothness and constraints, to handle the non-smooth and constrained nature of SNN and BNN, and provide convergence guarantees. Finally, we conclude the thesis with the main findings and insights on future research directions.
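
As a rough illustration of the combined learning-and-optimization idea in the first contribution, the sketch below (in Python) has a small neural scorer rank candidate users and then runs an exact search only over the top-ranked candidates. The problem sizes, the per-user features, the toy energy objective, and the untrained scorer are all assumptions for illustration, not the thesis's actual feature design or scheduling model.

# Hypothetical sketch: a DNN scores candidate scheduling decisions, and only the
# top-ranked candidates are passed to an exact search, shrinking the search space.
import itertools
import torch
import torch.nn as nn

n_users, n_slots, keep = 8, 4, 4          # toy problem sizes (assumed)

# Assumed "suited-to-learn" features: e.g., per-user channel quality and queue length.
features = torch.rand(n_users, 2)

# Untrained scorer used only for illustration; in a full pipeline it would be
# trained (supervised) on features extracted from optimal solutions.
scorer = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
with torch.no_grad():
    scores = scorer(features).squeeze(-1)              # likelihood of being scheduled

candidates = torch.topk(scores, keep).indices.tolist()  # ML-reduced search space

def energy(schedule):
    """Placeholder objective: pretend energy cost of a user->slot assignment."""
    return sum((u + 1) * 0.1 + (s + 1) * 0.05 for u, s in schedule)

# Exact search, but only over the pruned candidate set instead of all users.
best = min(
    (list(zip(combo, range(n_slots)))
     for combo in itertools.permutations(candidates, n_slots)),
    key=energy,
)
print("pruned schedule:", best, "energy:", round(energy(best), 3))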

Peer Reviewed
Energy Minimization in UAV-Aided Networks: Actor-Critic Learning for Constrained Scheduling Optimization
Yuan, Yaxiong UL; Lei, Lei UL; Vu, Thang Xuan UL et al

in IEEE Transactions on Vehicular Technology (2021)

In unmanned aerial vehicle (UAV) applications, the UAV's limited energy supply and storage have triggered the development of intelligent energy-conserving scheduling solutions. In this paper, we investigate energy minimization for UAV-aided communication networks by jointly optimizing data-transmission scheduling and UAV hovering time. The formulated problem is combinatorial and non-convex with bilinear constraints. To tackle the problem, we first provide an optimal relax-and-approximate solution and develop a near-optimal algorithm. Both proposed solutions serve as offline performance benchmarks but might not be suitable for online operation. To this end, we develop a solution from a deep reinforcement learning (DRL) perspective. Conventional RL/DRL, e.g., deep Q-learning, however, is limited in dealing with two main issues in constrained combinatorial optimization, i.e., an exponentially increasing action space and infeasible actions. The novelty of the solution development lies in handling these two issues. To address the former, we propose an actor-critic-based deep stochastic online scheduling (AC-DSOS) algorithm and develop a set of approaches to confine the action space. For the latter, we design a tailored reward function to guarantee solution feasibility. Numerical results show that, while consuming a comparable amount of time, AC-DSOS is able to provide feasible solutions and saves 29.94% energy compared with a conventional deep actor-critic method. Compared to the developed near-optimal algorithm, AC-DSOS consumes around 10% more energy but reduces the computational time from minute-level to millisecond-level.
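
The two ingredients called out above, a confined action space and a feasibility-enforcing reward, can be sketched roughly as follows. This is a minimal illustration in Python/PyTorch; the masking rule, the penalty form, and the toy numbers are assumptions, not the paper's exact design.

# Hypothetical sketch: (1) sample actions from a masked per-slot policy rather than
# enumerating joint schedules, and (2) shape the reward so that energy-budget
# violations are penalized, steering the policy toward feasible solutions.
import torch
from torch.distributions import Categorical

def select_action(actor_logits, feasible_mask):
    """Sample one user from the actor's policy, masking out infeasible choices."""
    masked = actor_logits.masked_fill(~feasible_mask, float("-inf"))
    dist = Categorical(logits=masked)
    action = dist.sample()
    return action, dist.log_prob(action)

def shaped_reward(tx_energy, hover_energy, energy_budget, penalty=10.0):
    """Negative energy as reward, with a large penalty if the budget is violated."""
    total = tx_energy + hover_energy
    return -total - (penalty if total > energy_budget else 0.0)

# Toy usage
logits = torch.randn(6)                                      # actor head over 6 users
mask = torch.tensor([1, 1, 0, 1, 1, 1], dtype=torch.bool)    # user 2 infeasible now
action, logp = select_action(logits, mask)
print(action.item(), shaped_reward(0.8, 1.1, energy_budget=2.5))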

Peer Reviewed
Actor-critic learning-based energy optimization for UAV access and backhaul networks
Yuan, Yaxiong UL; Lei, Lei UL; Vu, Thang Xuan UL et al

in EURASIP Journal on Wireless Communications and Networking (2021)

In unmanned aerial vehicle (UAV)-assisted networks, the UAV acts as an aerial base station which acquires the requested data via a backhaul link and then serves ground users (GUs) through an access network. In this paper, we investigate an energy minimization problem with a limited power supply for both backhaul and access links. The difficulty in solving such a non-convex and combinatorial problem lies in the high computational complexity/time. In the solution development, we consider approaches from both actor-critic deep reinforcement learning (AC-DRL) and optimization perspectives. First, two offline non-learning algorithms, i.e., an optimal and a heuristic algorithm based on piecewise linear approximation and relaxation, are developed as benchmarks. Second, toward real-time decision-making, we improve the conventional AC-DRL and propose two learning schemes: AC-based user group scheduling and backhaul power allocation (ACGP), and joint AC-based user group scheduling and optimization-based backhaul power allocation (ACGOP). Numerical results show that the computation time of both ACGP and ACGOP is reduced tenfold to hundredfold compared to the offline approaches, and that ACGOP outperforms ACGP in energy savings. The results also verify the superiority of the proposed learning solutions in guaranteeing feasibility and minimizing system energy compared to conventional AC-DRL.
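
A minimal sketch of the ACGOP-style split described above: a learned policy selects the user group, and a closed-form optimization step then sets the backhaul power. The power step below uses a textbook Shannon-rate inversion, which is an assumption for illustration rather than the paper's exact power-allocation subroutine; the user group is stubbed and all numbers are invented.

# Hypothetical sketch: given a user group chosen by the actor, allocate the minimum
# backhaul power meeting each user's rate demand in closed form.
def min_power_for_rate(rate_bps, bandwidth_hz, channel_gain, noise_w):
    """Invert r = B * log2(1 + p * g / N0) to get the minimum power p meeting rate r."""
    return noise_w * (2 ** (rate_bps / bandwidth_hz) - 1) / channel_gain

def backhaul_energy(selected_users, demands, gains,
                    bandwidth_hz=1e6, noise_w=1e-9, slot_s=1e-3):
    """Energy of serving the learned user group with per-user minimum power."""
    return sum(
        min_power_for_rate(demands[u], bandwidth_hz, gains[u], noise_w) * slot_s
        for u in selected_users
    )

# Toy usage: the "actor" output is stubbed here as a fixed user group.
demands = {0: 2e6, 1: 1e6, 2: 3e6}     # bits/s per ground user (assumed)
gains = {0: 1e-6, 1: 5e-7, 2: 2e-6}    # channel gains (assumed)
print(backhaul_energy(selected_users=[0, 2], demands=demands, gains=gains))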

Peer Reviewed
Actor-Critic Deep Reinforcement Learning for Energy Minimization in UAV-Aided Networks
Yuan, Yaxiong UL; Lei, Lei UL; Vu, Thang Xuan UL et al

in 2020 European Conference on Networks and Communications (EuCNC) (2020, September 21)

In this paper, we investigate a user-timeslot scheduling problem for downlink unmanned aerial vehicle (UAV)-aided networks, where the UAV serves as an aerial base station. We formulate an optimization problem that jointly determines user scheduling and hovering time to minimize the UAV's transmission and hovering energy. An offline algorithm is proposed to solve the problem based on the branch-and-bound method and the golden section search. However, executing the offline algorithm suffers from the exponential growth of computational time. Therefore, we apply a deep reinforcement learning (DRL) method to design an online algorithm with less computational time. To this end, we first reformulate the original user scheduling problem as a Markov decision process (MDP). Then, an actor-critic-based RL algorithm is developed to determine the scheduling policy under the guidance of two deep neural networks. Numerical results show that the proposed online algorithm achieves a good trade-off between performance gain and computational time.
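
The MDP reformulation mentioned above could look roughly like the sketch below, where the state tracks per-user outstanding demand and the current slot, the action selects a user, and the reward is the negative transmission-plus-hovering energy. The state and reward design here are assumptions for illustration, not the paper's exact formulation.

# Hypothetical MDP sketch for user-timeslot scheduling in the UAV network.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class State:
    slot: int
    remaining_bits: Tuple[float, ...]   # outstanding demand per user

def step(state: State, user: int, rate: float, tx_power: float,
         hover_power: float, slot_len: float = 1e-3):
    """One transition: serve `user` for one slot, pay transmission + hovering energy."""
    served = min(state.remaining_bits[user], rate * slot_len)
    new_bits = list(state.remaining_bits)
    new_bits[user] -= served
    energy = (tx_power + hover_power) * slot_len
    next_state = State(slot=state.slot + 1, remaining_bits=tuple(new_bits))
    done = all(b <= 0 for b in new_bits)
    return next_state, -energy, done          # reward = negative energy

# Toy usage
s0 = State(slot=0, remaining_bits=(2e3, 5e2))
s1, r, done = step(s0, user=1, rate=1e6, tx_power=0.5, hover_power=5.0)
print(s1, r, done)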

Peer Reviewed
Beam Illumination Pattern Design in Satellite Networks: Learning and Optimization for Efficient Beam Hopping
Lei, Lei UL; Lagunas, Eva UL; Yuan, Yaxiong UL et al

in IEEE Access (2020)

Beam hopping (BH) is considered to provide a high level of flexibility to manage irregular and time-varying traffic requests in future multi-beam satellite systems. In BH optimization, conventional iterative heuristics may have limitations in providing timely solutions, while directly using data-driven techniques to approximate the optimization variables may lead to constraint violations and degraded performance. In this paper, we explore a combined learning-and-optimization (L&O) approach to provide an efficient, feasible, and near-optimal solution. The investigation covers the following aspects: 1) integration of BH optimization and learning techniques; 2) features to be learned in BH design; 3) how to address the feasibility issue incurred by machine learning. We provide numerical results and analysis to show that the learning component in L&O significantly accelerates the identification of promising BH patterns, reducing the computing time from the second/minute level to the millisecond level. The identified learning feature enables high prediction accuracy. In addition, the optimization component in L&O guarantees the solution's feasibility and improves the overall performance, with around a 5% gap to the optimum.
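
A minimal, hypothetical sketch of this learn-then-repair flow: a predictor proposes a beam-illumination pattern, and an optimization/repair step enforces the per-slot limit on active beams so feasibility is guaranteed regardless of the prediction. The scores, beam budget, and repair rule below are all assumptions.

# Hypothetical sketch: ML proposes a beam-hopping pattern, a repair step keeps it feasible.
import numpy as np

rng = np.random.default_rng(0)
n_beams, max_active = 10, 4                 # at most 4 beams illuminated per slot (assumed)

traffic_demand = rng.random(n_beams)        # stand-in per-beam features
predicted_scores = traffic_demand + 0.1 * rng.standard_normal(n_beams)  # "ML" output

# Learning step: propose a pattern from the scores (may violate the beam budget).
proposed = predicted_scores > 0.5

# Optimization/repair step: keep only the max_active highest-scoring beams,
# guaranteeing feasibility whatever the predictor proposed.
if proposed.sum() > max_active:
    keep = np.argsort(predicted_scores)[::-1][:max_active]
    pattern = np.zeros(n_beams, dtype=bool)
    pattern[keep] = True
else:
    pattern = proposed

print("illuminated beams:", np.flatnonzero(pattern))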

Peer Reviewed
ProxSGD: Training Structured Neural Networks under Regularization and Constraints
Yang, Yang; Yuan, Yaxiong UL; Chatzimichailidis, Avraam et al

in International Conference on Learning Representations (ICLR) 2020 (2020)
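
As a generic illustration of the proximal-update idea the title refers to (training under regularization and constraints), the sketch below applies a textbook proximal-SGD step with l1 soft-thresholding. It is a standard formulation used for illustration only, not necessarily the paper's exact algorithm.

# Generic proximal-SGD step: gradient step on the smooth loss, then the proximal
# operator of the l1 regularizer (soft-thresholding), which drives small weights to zero.
import torch

def soft_threshold(x: torch.Tensor, tau: float) -> torch.Tensor:
    """Proximal operator of tau * ||x||_1."""
    return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

def prox_sgd_step(w: torch.Tensor, grad: torch.Tensor, lr: float, l1: float) -> torch.Tensor:
    """One update: descend on the smooth loss, then apply the l1 prox."""
    return soft_threshold(w - lr * grad, lr * l1)

# Toy usage
w = torch.tensor([0.8, -0.03, 0.5, 0.001])
g = torch.tensor([0.1, 0.2, -0.4, 0.05])
print(prox_sgd_step(w, g, lr=0.1, l1=0.5))   # small weights become exactly 0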
