Results 1-20 of 33.

Kieffer, Emmanuel
in Frontiers in Artificial Intelligence in Finance (2023)
Keeping strategic allocations at target level to maintain high exposure to private equity is a complex but essential task for investors, who need to balance it against the risk of default. Illiquidity and cashflow uncertainty are critical challenges, especially when commitments are irrevocable. In this work, we propose to use a trustworthy and explainable A.I. approach to design recommitment strategies. Using intensive portfolio simulations and evolutionary computing, we show that efficient and dynamic recommitment strategies can be brought forth automatically.

Fischbach, Tobias Michael
Scientific Conference (2023)
With the advent of the Exascale capability allowing supercomputers to perform at least 10^18 IEEE 754 double-precision (64-bit) operations per second, many concerns have been raised regarding the energy consumption of high-performance computing code. Recently, Frontier, operated by the Oak Ridge National Laboratory, became the first supercomputer to break the exascale barrier [1]. In total, Frontier contains 9,408 CPUs, 37,632 GPUs, and 8,730,112 cores. This world-leading supercomputer consumes about 21 megawatts, which is truly remarkable, as Frontier was also ranked first on the Green500 list before being recently replaced. The previous top Green500 machine, MN-3 in Japan, provided 39.38 gigaflops per watt, while Frontier delivered 62.68 gigaflops per watt. All these infrastructure and hardware improvements are, however, just the tip of the iceberg. Energy-aware code is now required to minimize the energy consumption of distributed and/or multi-threaded software. For example, the data-movement bottleneck is responsible for 35-60% of a system's energy consumption during intra-node communication. In an HPC environment, additional energy is consumed through inter-node communication. This position paper aims to introduce future research directions for entering the age of energy-aware software. The paper is organized as follows: first, we introduce related work regarding measurement and energy optimisation; then we focus on two different levels of granularity in energy optimisation.

Varrette, Sébastien
in 21st IEEE Intl. Symp. on Parallel and Distributed Computing (ISPDC'22) (2022, July)
High Performance Computing (HPC) is nowadays a strategic asset required to sustain the surging demand for massive processing and data-analytic capabilities. In practice, the effective management of such large-scale and distributed computing infrastructures is left to a Resource and Job Management System (RJMS). This essential middleware component is responsible for managing the computing resources, handling user requests to allocate resources, and providing an optimized framework for starting, executing and monitoring jobs on the allocated resources. The University of Luxembourg has been operating a large academic HPC facility for 15 years, which has relied since 2017 on the Slurm RJMS introduced on top of the flagship cluster Iris. The acquisition of a new liquid-cooled supercomputer named Aion, released in 2021, was the occasion to deeply review and optimize the seminal Slurm configuration, the resource limits defined, and the underlying fair-share algorithm. This paper presents the outcomes of this study, details the implemented RJMS policy, and describes the impact of the decisions made on the supercomputers' workloads. In particular, the performance evaluation conducted highlights that, compared to the seminal configuration, the described and implemented environment brought concrete and measurable improvements with regard to platform utilization (+12.64%), job efficiency (as measured by the average wall-time request accuracy, improved by 110.81%), and management and funding (increased by 10%). The systems demonstrated sustainable and scalable HPC performance, and this effort led to a negligible penalty on the average slowdown metric (response time normalized by runtime), which increased by 0.59% for job workloads covering a complete year of exercise. Overall, this new setup has been in production for 18 months on both supercomputers, and the updated model proves to bring a fairer and more satisfying experience to the end users. The proposed configurations and policies may help other HPC centres when designing or improving the RJMS sustaining their job scheduling strategy at the advent of computing capacity expansions.
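The scheduling metrics cited in the ISPDC'22 abstract above (wall-time request accuracy, average slowdown) are easy to make concrete. Below is a minimal Python sketch computing them from per-job accounting records, assuming the standard textbook definitions; the paper itself does not give code, and the `Job` fields here are illustrative.

```python
from dataclasses import dataclass

# Standard definitions are assumed; the paper's exact accounting may differ.
@dataclass
class Job:
    requested_walltime: float  # seconds asked for at submission
    runtime: float             # seconds actually used
    wait_time: float           # seconds spent queued

def walltime_request_accuracy(jobs):
    """Mean ratio of used runtime to requested wall-time (1.0 = perfect request)."""
    return sum(j.runtime / j.requested_walltime for j in jobs) / len(jobs)

def average_slowdown(jobs):
    """Mean response time (wait + run) normalized by runtime."""
    return sum((j.wait_time + j.runtime) / j.runtime for j in jobs) / len(jobs)

jobs = [Job(3600, 1800, 600), Job(7200, 7000, 120)]  # invented sample records
print(walltime_request_accuracy(jobs), average_slowdown(jobs))
```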
Varrette, Sébastien
in 6th High Performance Computing and Cluster Technologies Conference (HPCCT 2022) (2022, July)
With the advent of the technological revolution and the digital transformation that made all scientific disciplines computational, High Performance Computing (HPC) has become a strategic and critical asset for leveraging new research and business in all domains requiring computing and storage performance. Since 2007, the University of Luxembourg has operated a large academic HPC facility which remains the reference implementation within the country. This paper provides a general description of the current platform implementation as well as its operational management choices, which have been adapted to the integration of a new liquid-cooled supercomputer named Aion, released in 2021. The administration of an HPC facility providing state-of-the-art computing systems, storage and software is indeed a complex and dynamic enterprise, with the sole purpose of offering an enhanced user experience for intensive research computing and large-scale analytic workflows. Most design choices and feedback described in this work have been motivated by several years of experience in addressing, in a flexible and convenient way, the heterogeneous needs inherent to an academic environment striving for research excellence. The different layers and stacks used within the operated facilities are reviewed, in particular with regard to user software management and the adaptation of the Slurm Resource and Job Management System (RJMS) configuration with novel incentive mechanisms. In practice, the described and implemented environment brought concrete and measurable improvements with regard to platform utilization (+12.64%), job efficiency (average wall-time request accuracy improved by 110.81%), and management and funding (increased by 10%). A thorough performance evaluation of the facility is also presented through reference benchmarks such as HPL, HPCG, Graph500, IOR and IO500. It reveals sustainable and scalable performance comparable to the most powerful supercomputers in the world, including on energy-efficiency metrics (for instance, 5.19 GFlops/W (resp. 6.14 MTEPS/W) were demonstrated for full HPL (resp. Graph500) runs across all Aion nodes).
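As a quick sanity check on the energy-efficiency figure quoted above, GFlops/W is simply the sustained floating-point rate divided by the average power draw. The sketch below shows the arithmetic with hypothetical numbers; only the resulting ratio range matches what the paper reports.

```python
def gflops_per_watt(rmax_tflops: float, avg_power_kw: float) -> float:
    """Energy efficiency: sustained TFlop/s converted to GFlop/s, per watt."""
    return (rmax_tflops * 1e3) / (avg_power_kw * 1e3)

# Hypothetical numbers, not the actual Aion measurements: a 1.7 PFlop/s HPL
# run drawing 330 kW would score about 5.15 GFlops/W, in the reported range.
print(gflops_per_watt(1700.0, 330.0))
```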
; Brust, Mathias
in PLoS ONE (2022), 17(1), 1-17

Kieffer, Emmanuel
in A RNN-Based Hyper-Heuristic for Combinatorial Problems (2022)
Designing efficient heuristics is a laborious and tedious task that generally requires full understanding and knowledge of a given optimization problem. Hyper-heuristics have been introduced mainly to tackle this issue, and they mostly rely on Genetic Programming and its variants. Many attempts in the literature have shown that an automatic training mechanism for heuristic learning is possible and can challenge human-designed heuristics in terms of gap to optimality. In this work, we introduce a novel approach based on recent work on Deep Symbolic Regression. We demonstrate that scoring functions can be trained using Recurrent Neural Networks to tackle a well-known combinatorial problem, i.e., the multi-dimensional knapsack. Experiments have been conducted on instances from the OR-Library, and the results show that the proposed modus operandi is an alternative and promising approach to human-designed heuristics and classical heuristic generation approaches.
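The hyper-heuristic entry above trains scoring functions that drive a greedy heuristic for the multi-dimensional knapsack. As a rough illustration of where such a learned function would plug in, here is a plain greedy solver parameterized by an arbitrary score callback; the `profit_density` baseline and all instance data are invented for the example, and the RNN training itself is not shown.

```python
def greedy_mkp(profits, weights, capacities, score):
    """Greedy multi-dimensional knapsack heuristic: repeatedly take the
    feasible item ranked highest by the supplied scoring function."""
    n, m = len(profits), len(capacities)
    remaining = list(capacities)
    chosen, candidates = [], set(range(n))
    while candidates:
        feasible = [i for i in candidates
                    if all(weights[k][i] <= remaining[k] for k in range(m))]
        if not feasible:
            break
        best = max(feasible, key=lambda i: score(i, profits, weights, remaining))
        chosen.append(best)
        candidates.remove(best)
        for k in range(m):
            remaining[k] -= weights[k][best]
    return chosen

# A hand-written baseline score; the paper's point is that a trained RNN can
# propose symbolic expressions playing this role instead.
def profit_density(i, profits, weights, remaining):
    used = sum(weights[k][i] / max(remaining[k], 1e-9) for k in range(len(remaining)))
    return profits[i] / (used + 1e-9)

profits = [10, 7, 12, 4]            # invented toy instance
weights = [[3, 2, 5, 1],            # resource 0 consumption per item
           [1, 4, 2, 2]]            # resource 1 consumption per item
print(greedy_mkp(profits, weights, [6, 5], profit_density))
```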
Varrette, Sébastien
in ACM Practice and Experience in Advanced Research Computing (PEARC'21) (2021, July)
High Performance Computing (HPC) is increasingly identified as a strategic asset and enabler for accelerating the research and business performed in all areas requiring intensive computing and large-scale Big Data analytic capabilities. The efficient exploitation of heterogeneous computing resources featuring different processor architectures and generations, coupled with the eventual presence of GPU accelerators, remains a challenge. The University of Luxembourg has operated since 2007 a large academic HPC facility which remains one of the reference implementations within the country and offers a cutting-edge research infrastructure to Luxembourg public research. The HPC support team invests a significant amount of time (i.e., several months of effort per year) in providing a software environment optimised for hundreds of users, but the complexity of HPC software was quickly outpacing the capabilities of classical software management tools. Since 2014, our scientific software stack has been generated and deployed in an automated and consistent way through the RESIF framework, a wrapper on top of Easybuild and Lmod [5] meant to efficiently handle user software generation. A large code refactoring was performed in 2017 to better handle different software sets and roles across multiple clusters, all piloted through a dedicated control repository. With the advent in 2020 of a new supercomputer featuring a different CPU architecture, and to mitigate the identified limitations of the existing framework, we report in this state-of-practice article on RESIF 3.0, the latest iteration of our scientific software management suite, now relying on streamlined Easybuild. It allowed us to reduce by around 90% the number of custom configurations previously enforced by specific Slurm and MPI settings, while sustaining optimised builds coexisting for different dimensions of CPU and GPU architectures. The workflow for contributing back to the Easybuild community was also automated, and current work in progress aims at drastically decreasing the build time of a complete software-set generation. Overall, most design choices for our wrapper have been motivated by several years of experience in addressing, in a flexible and convenient way, the heterogeneous needs inherent to an academic environment aiming for research excellence. As the code base is publicly available, and as we wish to transparently report the pitfalls and difficulties met, this tool may help other HPC centres to consolidate their own software management stack.

Kieffer, Emmanuel
in Springer CCIS series (2021)
Recommitments are essential for limited-partner investors to maintain a target exposure to private equity. However, recommitting to new funds is irrevocable and exposes investors to cashflow uncertainty and illiquidity. Maintaining a specific target allocation is therefore a tedious and critical task. Unfortunately, recommitment strategies are still designed manually, and few works in the literature have endeavored to develop a recommitment system balancing opportunity cost and risk of default. Due to its strong similarities to a control system, we propose to "learn how to recommit" with Reinforcement Learning (RL) and, more specifically, Proximal Policy Optimisation (PPO). To the best of our knowledge, this is the first time an RL algorithm has been applied to private equity with the aim of solving the recommitment problem. After training the RL model on simulated portfolios, the resulting recommitment policy is compared to state-of-the-art strategies. Numerical results suggest that the trained policy can achieve high target allocation while bounding the risk of being overinvested.
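The PPO paper above casts recommitment as a control problem over simulated portfolios. A minimal sketch of what such a simulated environment could look like is given below; the cashflow dynamics, reward shaping and constants are placeholders rather than the authors' portfolio model, and any RL library could be trained against this interface.

```python
import random

class RecommitmentEnv:
    """Toy episodic recommitment model: each quarter the agent picks a
    recommitment fraction, then random capital calls and distributions update
    the portfolio. All dynamics below are invented placeholders."""
    def __init__(self, target=1.0, horizon=40):
        self.target, self.horizon = target, horizon

    def reset(self):
        self.t, self.exposure, self.uncalled = 0, 0.0, 0.0
        return (self.exposure, self.uncalled)

    def step(self, recommit_fraction):
        # New commitment, then random capital calls and distributions.
        self.uncalled += recommit_fraction * self.target
        called = random.uniform(0.1, 0.3) * self.uncalled
        self.uncalled -= called
        self.exposure += called - random.uniform(0.05, 0.15) * self.exposure
        # Reward: track the target while penalizing over-investment.
        reward = -abs(self.exposure - self.target)
        if self.exposure > 1.2 * self.target:   # "defaulting investor" zone
            reward -= 1.0
        self.t += 1
        return (self.exposure, self.uncalled), reward, self.t >= self.horizon

env, total = RecommitmentEnv(), 0.0
state, done = env.reset(), False
while not done:
    state, r, done = env.step(0.2)   # a fixed policy; PPO would learn this
    total += r
print(total)
```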
Kieffer, Emmanuel
in 2021 IEEE Symposium Series on Computational Intelligence (SSCI) (2021)
Achieving and maintaining high allocations to private equity, and keeping allocations at the targeted level through recommitment strategies, is a complex task which needs to be balanced against the risk of becoming a defaulting investor. When looking at recommitments, we are quickly faced with a combinatorial explosion of the solution space, rendering explicit enumeration impossible. As a consequence, manual management, where it exists at all, becomes time-consuming and error-prone. For this reason, investors need guidance and decision-aid algorithms producing reliable, robust and trustworthy recommitment strategies. In this work, we propose to automatically generate recommitment strategies based on the evolution of symbolic expressions, so as to provide clear and understandable decision rules to private equity experts and investors. To the best of our knowledge, this is the first time a methodology for learning recommitment strategies through evolutionary learning has been proposed. Experiments demonstrate the capacity of the proposed approach to generate efficient and robust strategies, keeping a high degree of investment while bounding the risk of being overinvested.

Mainassara Chekaraou, Abdoul Wahid
in 10th IEEE Workshop on Parallel / Distributed Combinatorics and Optimization (2020, June)
The Verlet list method is a well-known bookkeeping technique for the interaction list, used in both Molecular Dynamics (MD) and the Discrete Element Method (DEM). The Verlet buffer technique is an enhancement of the Verlet list that extends the interaction radius of each particle by an extra margin, so as to take more particles into account in the interaction list. The extra margin is based on the local flow regime of each particle, to account for the different flow regimes that can coexist in the domain. However, the choice of the near-optimal extra margin (the one ensuring the best performance) for each particle, and of the related parameters, remains unexplored in DEM, unlike in MD. In this study, we demonstrate that the near-optimal extra margin can fairly be characterized by four parameters describing each particle's local flow regime: the particle velocity, the ratio of the containing cell size to the particle size, the solid fraction of the containing cell, and the total number of particles in the system. For this purpose, we model the near-optimal extra margin as a quadratic polynomial function of these parameters. We use the DAKOTA software to carry out the Design and Analysis of Computer Experiments (DACE) and the sampling of the parameters for the simulations. For a given instance of the parameter set, a global optimization method is used to find the near-optimal extra margin, which is required for the construction of the quadratic polynomial model. The numerous simulations generated by the parameter sampling were performed in a High-Performance Computing (HPC) environment allowing parallel and concurrent executions. This work provides a better understanding of the Verlet buffer method in DEM simulations by analyzing its performance and behavior in various configurations. The near-optimal extra margin can reasonably be predicted by two of the four chosen parameters using the quadratic polynomial model. This model has been integrated into XDEM so that the extra margin is chosen automatically, without any input from the user. Evaluations on real industrial-level test cases show up to a 26% reduction in execution time.
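The Verlet-buffer study above fits the near-optimal extra margin with a quadratic polynomial in four flow-regime parameters. The following sketch reproduces that modelling step in the abstract, fitting a full quadratic basis by least squares; the synthetic samples stand in for the DAKOTA/DACE campaign and carry no physical meaning.

```python
import numpy as np

def quadratic_features(X):
    """Expand [velocity, size_ratio, solid_fraction, n_particles] rows into a
    full quadratic basis: 1, x_i, x_i * x_j (i <= j)."""
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 4))   # sampled parameter sets (stand-in for DACE)
# Synthetic response with noise; a real campaign would measure the
# performance-optimal margin per parameter set instead.
y = 0.3 + 0.5 * X[:, 0] + 0.2 * X[:, 0] * X[:, 1] + 0.01 * rng.normal(size=200)

coeffs, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)

def predict_margin(params):
    return quadratic_features(np.atleast_2d(params)) @ coeffs

print(predict_margin([0.8, 0.5, 0.4, 0.1]))
```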
Pinel, Frédéric
in Communications in Computer and Information Science (2020, February)
We present a procedure for the design of a Deep Neural Network (DNN) that estimates the per-batch execution time for training a deep neural network on GPU accelerators. The estimator is destined to be embedded in the scheduler of a shared GPU infrastructure, capable of providing estimated training times for a wide range of network architectures when a user submits a training job. To this end, a very short and simple representation of a given DNN is chosen. To compensate for the limited descriptive power of this basic network representation, a novel co-evolutionary approach is taken to fit the estimator. The training set for the estimator, i.e., DNNs, is evolved by an evolutionary algorithm that optimizes the accuracy of the estimator. In the process, the genetic algorithm evolves DNNs, generates Python-Keras programs, and projects them onto the simple representation. The genetic operators are dynamic: they change with the estimator's accuracy in order to balance accuracy with generalization. Results show that, despite the low degree of information in the representation and the simple initial design of the predictor, co-evolving the training set performs better than a near-randomly generated population of DNNs.

; Kieffer, Emmanuel
in Journal of Computational Science (2020), 41

Kieffer, Emmanuel
in IEEE Transactions on Evolutionary Computation (2020), 24(1), 44-56
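The execution-time estimator above hinges on projecting an evolved Keras network onto a deliberately short representation. One possible shape for such a projection is sketched below; the encoding (a type code plus width per layer, zero-padded) is an assumption for illustration, not the paper's actual representation.

```python
def project(layers, max_layers=8):
    """Project a DNN description onto a fixed-length vector: one
    (type_code, width) pair per layer, zero-padded. The encoding is invented
    here to mirror the idea of a deliberately crude representation."""
    type_codes = {"dense": 1, "conv": 2, "pool": 3}
    vec = []
    for kind, width in layers[:max_layers]:
        vec += [type_codes[kind], width]
    vec += [0] * (2 * max_layers - len(vec))
    return vec

# (features, measured seconds/batch) pairs would come from actually timing
# evolved Keras models on the target GPU, then feeding any regressor.
print(project([("conv", 64), ("pool", 2), ("dense", 128)]))
```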
Varrette, Sébastien
in Proc. of 13th Intl. Conf. on Parallel Processing and Applied Mathematics (PPAM 2019) (2019, December)
For large-scale systems such as data centers, energy efficiency has proven key to reducing capital and operational expenses as well as environmental impact. The power drain of a system is closely related to the type and characteristics of the workload the device is running. For this reason, this paper presents an automatic software tuning method for parallel program generation, able to adapt to and exploit the hardware features available on a target computing system, such as an HPC facility or a cloud system, better than traditional compiler infrastructures. We propose a search-based approach combining exact methods and approximate heuristics that evolves programs in order to find optimized configurations relying on an ever-increasing number of tunable knobs, i.e., code transformation and execution options (such as the number of OpenMP threads and/or the CPU frequency settings). The main objective is to outperform the configurations generated by traditional compiling infrastructures for selected KPIs, i.e., performance, energy and power usage (for both the CPU and DRAM), as well as the runtime. First experimental results, tied to the local optimization phase of the proposed framework, are encouraging, demonstrating between 8% and 41% improvement for all considered metrics on a reference benchmarking application (i.e., Linpack). This brings novel perspectives for the global optimization step currently under investigation within the presented framework, with the ambition to pave the way toward automatic tuning of energy-aware applications beyond the performance of current state-of-the-art compiler infrastructures.
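The auto-tuning entry above searches over execution knobs such as the OpenMP thread count and the CPU frequency. The sketch below shows the skeleton of such a tuning loop for the runtime metric only; `./linpack_bench` is a hypothetical binary, frequency setting is merely indicated, and energy or power readings (e.g., via RAPL counters) would be collected analogously.

```python
import os
import subprocess
import time

def run_with_knobs(cmd, threads, freq_khz=None):
    """Time one run of `cmd` under a given knob configuration. Setting the CPU
    frequency usually needs privileged tools (e.g., cpupower), so it is only
    sketched in the comment below."""
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    # if freq_khz: set the governor/frequency here (requires privileges)
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)
    return time.perf_counter() - start

def exhaustive_tune(cmd, thread_counts):
    """Try each thread count and return the fastest, plus all timings."""
    timings = {t: run_with_knobs(cmd, t) for t in thread_counts}
    return min(timings, key=timings.get), timings

# Hypothetical benchmark binary; the paper uses Linpack as its reference app.
best, timings = exhaustive_tune(["./linpack_bench"], [1, 2, 4, 8, 16])
print(best, timings)
```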
Duflo, Gabriel
in 33rd IEEE International Parallel & Distributed Processing Symposium (IPDPS 2019) (2019, May 20)

Kieffer, Emmanuel
in IEEE Transactions on Evolutionary Computation (2019)
Combinatorial bi-level optimization remains a challenging topic, especially when the lower level is an NP-hard problem. In this work, we tackle large-scale and combinatorial bi-level problems using GP hyper-heuristics, i.e., an approach that permits training heuristics like a machine learning model. Our contribution targets the intensive and complex lower-level optimizations that occur when solving a large-scale and combinatorial bi-level problem. For this purpose, we consider hyper-heuristics through heuristic generation. Using a GP hyper-heuristic approach, we train greedy heuristics to make them more reliable when encountering the unseen lower-level instances that can be generated during bi-level optimization. To validate our approach, referred to as GA+AGH, we tackle instances of the Bi-level Cloud Pricing Optimization Problem (BCPOP), which models the trading interactions between a cloud service provider and cloud service customers. Numerical results demonstrate the ability of the trained heuristics to cope with the inherent nested structure that makes bi-level optimization problems so hard. Furthermore, it has been shown that training heuristics for lower-level optimization permits outperforming human-designed heuristics and metaheuristics, which constitutes an excellent outcome for bi-level optimization.

Duflo, Gabriel
Scientific Conference (2019, January 29)

Kieffer, Emmanuel
Doctoral thesis (2019)
Multi-level optimization stems from the need to tackle complex problems involving multiple decision makers. Two-level optimization, referred to as "bi-level optimization", occurs when two decision makers each control part of the decision variables but impact each other (e.g., objective value, feasibility). Bi-level problems are sequential by nature and can be represented as nested optimization problems in which one problem (the "upper level") is constrained by another one (the "lower level"). The nested structure is a real obstacle that can be highly time-consuming when the lower level is NP-hard. Consequently, classical nested optimization should be avoided. Some surrogate-based approaches have been proposed to approximate the lower-level objective value function (or variables) in order to reduce the number of times the lower level is globally optimized. Unfortunately, such a methodology is not applicable to large-scale and combinatorial bi-level problems. After a deep study of theoretical properties and a survey of the existing applications that are bi-level by nature, problems which can benefit from a bi-level reformulation are investigated. A first contribution of this work is a novel bi-level clustering approach. Extending the well-known "uncapacitated k-median problem", it has been shown that clustering can be easily modeled as a two-level optimization problem using decomposition techniques. The resulting two-level problem is then turned into a bi-level problem offering the possibility to combine distance metrics in a hierarchical manner. The novel bi-level clustering problem has a very interesting property that enables us to tackle it with classical nested approaches: its lower-level problem can be solved in polynomial time. In cooperation with the Luxembourg Centre for Systems Biomedicine (LCSB), this new clustering model has been applied to real datasets such as disease maps (e.g., Parkinson's, Alzheimer's). Using a novel hybrid and parallel genetic algorithm as the optimization approach, the results obtained after a campaign of experiments produce new knowledge compared to classical clustering techniques combining distance metrics in the usual manner. The previous bi-level clustering model has the advantage that its lower level can be solved in polynomial time, although the global problem is by definition NP-hard. Therefore, further investigations were undertaken to tackle more general bi-level problems in which the lower-level problem does not present any such advantageous properties. Since the lower-level problem can be very expensive to solve, the focus was turned to surrogate-based approaches and hyper-parameter optimization techniques, with the aim of approximating the lower-level problem and reducing the number of global lower-level optimizations. By adapting the well-known Bayesian optimization algorithm to solve general bi-level problems, the number of expensive lower-level optimizations was dramatically reduced while still obtaining very accurate solutions. The resulting solutions and the number of spared lower-level optimizations were compared to the results of the bi-level evolutionary algorithm based on quadratic approximations (BLEAQ) after a campaign of experiments on official bi-level benchmarks. Although both approaches are very accurate, the bi-level Bayesian version required fewer lower-level objective function calls. Surrogate-based approaches are restricted to small-scale and continuous bi-level problems, although many real applications are combinatorial by nature. As for continuous problems, a study was performed to apply some machine learning strategies. Instead of approximating the lower-level solution value, new approximation algorithms for the discrete/combinatorial case were designed. Using the principle employed in GP hyper-heuristics, heuristics are trained to tackle efficiently the NP-hard lower level of bi-level problems. This automatic generation of heuristics permits breaking the nested structure into two separate phases: training lower-level heuristics and solving the upper-level problem with the new heuristics. On this occasion, a second modeling contribution was introduced through a novel large-scale and mixed-integer bi-level problem dealing with pricing in the cloud, i.e., the Bi-level Cloud Pricing Optimization Problem (BCPOP). After a series of experiments that consisted in training heuristics on various lower-level instances of the BCPOP and using them to tackle the bi-level problem itself, the obtained results were compared to those of the "cooperative coevolutionary algorithm for bi-level optimization" (COBRA). Although training heuristics enables breaking the nested structure, a two-phase optimization is still required. Therefore, the emphasis was put on training heuristics while optimizing the upper-level problem using competitive co-evolution. Instead of adopting the classical decomposition scheme, as done by COBRA, which suffers from the strong epistatic links between lower-level and upper-level variables, co-evolving the solution and the means to get to it can cope with these epistatic-link issues. The "CARBON" algorithm developed in this thesis is a competitive and hybrid co-evolutionary algorithm designed for this purpose. In order to validate the potential of CARBON, numerical experiments were designed and the results compared to state-of-the-art algorithms. These results demonstrate that "CARBON" makes it possible to address nested optimization efficiently.
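Several of the 2019 entries above, the thesis in particular, revolve around the cost of nested bi-level optimization: every upper-level candidate triggers a full lower-level solve. The toy sketch below makes that structure explicit with a brute-force follower; the leader/follower objectives and data are invented purely to show the nesting, not any problem studied in the thesis.

```python
import itertools

def solve_lower_level(capacity, items):
    """Follower's best response: brute-force knapsack here, which is exactly
    the bottleneck when the lower level is NP-hard."""
    best, best_val = None, float("-inf")
    for subset in itertools.product([0, 1], repeat=len(items)):
        if sum(w * s for (w, _), s in zip(items, subset)) <= capacity:
            val = sum(p * s for (_, p), s in zip(items, subset))
            if val > best_val:
                best, best_val = subset, val
    return best, best_val

def upper_objective(capacity, items, price_per_unit=1.0):
    # Leader's payoff depends on the follower's optimal reaction:
    # one full lower-level solve per upper-level candidate.
    _, follower_profit = solve_lower_level(capacity, items)
    return follower_profit - price_per_unit * capacity

items = [(3, 10), (2, 7), (5, 12)]   # (weight, profit) pairs, illustrative only
best_x = max(range(0, 11), key=lambda x: upper_objective(x, items))
print(best_x, upper_objective(best_x, items))
```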
; Kieffer, Emmanuel
in Foundations of Computing and Decision Sciences (2019)

Kieffer, Emmanuel
in 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (2018, May 25)