Results 1-20 of 44.

Varrette, Sébastien
in 21st IEEE Intl. Symp. on Parallel and Distributed Computing (ISPDC'22) (2022, July)
High Performance Computing (HPC) is nowadays a strategic asset required to sustain the surging demands for massive processing and data-analytic capabilities. In practice, the effective management of such large-scale and distributed computing infrastructures is left to a Resource and Job Management System (RJMS). This essential middleware component is responsible for managing the computing resources and handling user requests to allocate resources, while providing an optimized framework for starting, executing and monitoring jobs on the allocated resources. The University of Luxembourg has been operating a large academic HPC facility for 15 years, which has relied since 2017 on the Slurm RJMS introduced on top of the flagship cluster Iris. The acquisition of a new liquid-cooled supercomputer named Aion, released in 2021, was the occasion to deeply review and optimize the initial Slurm configuration, the resource limits defined and the underlying fairsharing algorithm. This paper presents the outcomes of this study and details the implemented RJMS policy. The impact of these decisions on the supercomputers' workloads is also described. In particular, the performance evaluation conducted highlights that, compared to the initial configuration, the described and implemented environment brought concrete and measurable improvements with regard to platform utilization (+12.64%), job efficiency (as measured by the average Wall-time Request Accuracy, improved by 110.81%) and management and funding (increased by 10%). The systems demonstrated sustainable and scalable HPC performance, and this effort led to a negligible penalty on the average slowdown metric (response time normalized by runtime), which increased by only 0.59% for job workloads covering a complete year of operation. Overall, this new setup has been in production for 18 months on both supercomputers and the updated model proves to bring a fairer and more satisfying experience to the end users. The proposed configurations and policies may help other HPC centres when designing or improving the RJMS sustaining their job scheduling strategy at the advent of computing capacity expansions.
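Both job-level metrics quoted in this abstract can be computed directly from scheduler accounting records. Below is a minimal sketch assuming hypothetical job records (requested wall-time, runtime, wait time); the numbers are placeholders and not taken from the paper.

```python
# Hypothetical accounting records: (requested wall-time, runtime, wait time) in seconds.
jobs = [
    (3600, 1800, 120),
    (7200, 6900, 600),
    (1800, 300, 30),
]

def walltime_request_accuracy(requested, runtime):
    """Fraction of the requested wall-time actually used (1.0 = perfect request)."""
    return runtime / requested

def slowdown(runtime, wait):
    """Response time (wait + runtime) normalized by runtime, as defined in the abstract."""
    return (wait + runtime) / runtime

n = len(jobs)
avg_wra = sum(walltime_request_accuracy(req, run) for req, run, _ in jobs) / n
avg_slowdown = sum(slowdown(run, wait) for _, run, wait in jobs) / n
print(f"average wall-time request accuracy: {avg_wra:.2%}")
print(f"average slowdown: {avg_slowdown:.2f}")
```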
Talbot, Pierre
in Proceedings of the AAAI Conference on Artificial Intelligence (2022, June), 36(4), 3830-3839
The number of cores on graphics processing units (GPUs) is reaching thousands nowadays, whereas the clock speed of processors stagnates. Unfortunately, constraint programming solvers do not yet take advantage of GPU parallelism. One reason is that constraint solvers were primarily designed within the mental frame of sequential computation. To address this issue, we take a step back and contribute a simple, intrinsically parallel, lock-free and formally correct programming language based on concurrent constraint programming. We then re-examine parallel constraint solving on GPUs within this formalism, and develop Turbo, a simple constraint solver entirely programmed on GPUs. Turbo validates the correctness of our approach and compares positively to a parallel CPU-based solver.

Varrette, Sébastien
in ACM Practice and Experience in Advanced Research Computing (PEARC'21) (2021, July)
High Performance Computing (HPC) is increasingly identified as a strategic asset and enabler to accelerate the research and business performed in all areas requiring intensive computing and large-scale Big Data analytic capabilities. The efficient exploitation of heterogeneous computing resources featuring different processor architectures and generations, coupled with the eventual presence of GPU accelerators, remains a challenge. The University of Luxembourg has operated since 2007 a large academic HPC facility which remains one of the reference implementations within the country and offers a cutting-edge research infrastructure to Luxembourg public research. The HPC support team invests a significant amount of time (i.e., several months of effort per year) in providing a software environment optimised for hundreds of users, but the complexity of HPC software was quickly outpacing the capabilities of classical software management tools. Since 2014, our scientific software stack has been generated and deployed in an automated and consistent way through the RESIF framework, a wrapper on top of Easybuild and Lmod [5] meant to efficiently handle user software generation. A large code refactoring was performed in 2017 to better handle different software sets and roles across multiple clusters, all piloted through a dedicated control repository. With the advent in 2020 of a new supercomputer featuring a different CPU architecture, and to mitigate the identified limitations of the existing framework, we report in this state-of-practice article RESIF 3.0, the latest iteration of our scientific software management suite, now relying on streamlined Easybuild. It permitted a reduction of around 90% in the number of custom configurations previously enforced by specific Slurm and MPI settings, while sustaining optimised builds coexisting for different dimensions of CPU and GPU architectures. The workflow for contributing back to the Easybuild community was also automated, and work currently in progress aims at drastically decreasing the build time of a complete software set generation. Overall, most design choices for our wrapper have been motivated by several years of experience in addressing, in a flexible and convenient way, the heterogeneous needs inherent to an academic environment aiming for research excellence. As the code base is publicly available, and as we wish to transparently report also the pitfalls and difficulties met, this tool may help other HPC centres to consolidate their own software management stack.

Pinel, Frédéric
in Proceedings of the International Conference on Agents and Artificial Intelligence (ICAART) (2021, February 05)
Elementary cellular automata (ECA) are simple dynamic systems which display complex behaviour from simple local interactions. The complex behaviour is apparent in the two-dimensional temporal evolution of a cellular automaton, which can be viewed as an image composed of black and white pixels. The visual patterns within these images inspired several ECA classifications, aimed at matching the automata's properties to observed patterns, visual or statistical. In this paper, we quantitatively compare 11 ECA classifications. In contrast to the a priori logic behind a classification, we propose an a posteriori evaluation of a classification. The evaluation employs a convolutional neural network, trained to classify each ECA to its assigned class in a classification. The prediction accuracy indicates how well the convolutional neural network is able to learn the underlying classification logic, and reflects how well this classification logic clusters patterns in the temporal evolution. Results show varying prediction accuracies (yet all above 85%); three classifications are very well captured by our simple convolutional neural network (accuracy above 99%), even though it is trained on a small extract from the temporal evolution with few observations (100 per ECA, evolving 513 cells). In addition, we explain an unreported "pathological" behaviour in two ECAs.
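The two-dimensional temporal evolution referred to above is easy to reproduce. Below is a minimal NumPy sketch of an ECA evolution, using the standard Wolfram rule encoding and periodic boundaries (assumptions of this sketch, not details taken from the paper); the resulting 0/1 image is the kind of input a convolutional classifier would be trained on.

```python
import numpy as np

def eca_evolution(rule, width=64, steps=64, seed=0):
    """Temporal evolution of an elementary cellular automaton as a 2-D 0/1 array."""
    rng = np.random.default_rng(seed)
    # Rule number -> lookup table over the 8 possible (left, center, right) neighbourhoods.
    table = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    state = rng.integers(0, 2, size=width, dtype=np.uint8)   # random initial row
    rows = [state]
    for _ in range(steps - 1):
        left, right = np.roll(state, 1), np.roll(state, -1)  # periodic boundary
        state = table[4 * left + 2 * state + right]          # apply the rule cell-wise
        rows.append(state)
    return np.stack(rows)                                    # rows = time, columns = cells

image = eca_evolution(rule=110)   # black/white image of the evolution of rule 110
```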
Kieffer, Emmanuel
in Springer CCIS series (2021)
Recommitments are essential for limited partner investors to maintain a target exposure to private equity. However, recommitting to new funds is irrevocable and exposes investors to cashflow uncertainty and illiquidity. Maintaining a specific target allocation is therefore a tedious and critical task. Unfortunately, recommitment strategies are still manually designed and few works in the literature have endeavoured to develop a recommitment system balancing opportunity cost and risk of default. Due to its strong similarities to a control system, we propose to "learn how to recommit" with Reinforcement Learning (RL) and, more specifically, using Proximal Policy Optimisation (PPO). To the best of our knowledge, this is the first time an RL algorithm has been applied to private equity with the aim of solving the recommitment problem. After training the RL model on simulated portfolios, the resulting recommitment policy is compared to state-of-the-art strategies. Numerical results suggest that the trained policy can achieve a high target allocation while bounding the risk of being overinvested.
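As a rough illustration of the RL formulation only, the sketch below wires a PPO agent (here using the stable-baselines3 and gymnasium libraries) to a toy portfolio environment. The environment dynamics, reward shaping and hyper-parameters are invented placeholders and not the simulator or model used in the paper.

```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO

class RecommitmentEnv(gym.Env):
    """Toy stand-in for a private-equity portfolio simulator (not the paper's model)."""

    def __init__(self):
        super().__init__()
        # Observation: [current exposure, uncalled commitments], as fractions of NAV.
        self.observation_space = spaces.Box(low=0.0, high=5.0, shape=(2,), dtype=np.float32)
        # Action: fraction of this period's distributions to recommit.
        self.action_space = spaces.Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32)
        self.target = 1.0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.exposure, self.uncalled, self.t = 0.5, 0.3, 0
        return np.array([self.exposure, self.uncalled], dtype=np.float32), {}

    def step(self, action):
        recommit = float(action[0])
        calls = 0.25 * self.uncalled          # toy capital-call dynamics
        dist = 0.05 * self.exposure           # toy distributions
        self.uncalled += recommit * dist - calls
        self.exposure += calls - dist
        self.t += 1
        # Reward: stay close to the target allocation, penalise over-investment harder.
        gap = self.exposure - self.target
        reward = -abs(gap) - (2.0 * gap if gap > 0 else 0.0)
        terminated = self.t >= 80
        obs = np.array([self.exposure, self.uncalled], dtype=np.float32)
        return obs, reward, terminated, False, {}

model = PPO("MlpPolicy", RecommitmentEnv(), verbose=0)
model.learn(total_timesteps=10_000)           # the learned policy maps state -> recommitment
```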
Kieffer, Emmanuel
in 2021 IEEE Symposium Series on Computational Intelligence (SSCI) (2021)
Achieving and maintaining high allocations to Private Equity, and keeping allocations at the targeted level through recommitment strategies, is a complex task which needs to be balanced against the risk of becoming a defaulting investor. When looking at recommitments, we are quickly faced with a combinatorial explosion of the solution space, rendering explicit enumeration impossible. As a consequence, manual management, where it exists at all, becomes time-consuming and error-prone. For this reason, investors need guidance and decision-aid algorithms producing reliable, robust and trustworthy recommitment strategies. In this work, we propose to automatically generate recommitment strategies based on the evolution of symbolic expressions, in order to provide clear and understandable decision rules to Private Equity experts and investors. To the best of our knowledge, this is the first time a methodology to learn recommitment strategies using evolutionary learning has been proposed. Experiments demonstrate the capacity of the proposed approach to generate efficient and robust strategies, keeping a high degree of investment while bounding the risk of being overinvested.

Varrette, Sébastien
in 20th IEEE/ACM Intl. Symp. on Cluster, Cloud and Internet Computing (CCGrid'20) (2020, May)
With renewed global interest in Artificial Intelligence (AI) methods, the past decade has seen a myriad of new programming models and tools that enable better and faster Machine Learning (ML). More recently, a subset of ML known as Deep Learning (DL) has raised increased interest due to its inherent ability to tackle efficiently novel cognitive computing applications. DL allows computational models that are composed of multiple processing layers to learn, in an automated way, representations of data with multiple levels of abstraction, and can deliver higher predictive accuracy when trained on larger data sets. Based on Artificial Neural Networks (ANN), DL is now at the core of state-of-the-art voice recognition systems (which enable easy control over, e.g., Internet-of-Things (IoT) smart home appliances), self-driving car engines and online recommendation systems. The ecosystem of DL frameworks is fast evolving, as are the DL architectures that are shown to perform well on specialized tasks and to exploit GPU accelerators. For this reason, frequent performance evaluation of the DL ecosystem is required, especially since the advent of novel distributed training frameworks such as Horovod allowing for scalable training across multiple computing resources. In this paper, the scalability evaluation of the reference DL frameworks (Tensorflow, Keras, MXNet, and PyTorch) is performed over up-to-date High Performance Computing (HPC) resources to compare the efficiency of different implementations across several hardware architectures (CPU and GPU). Experimental results demonstrate that the DistributedDataParallel features of the PyTorch library seem to be the most efficient mechanism for distributing the training process across many devices, reaching a throughput speedup of 10.11 when using 12 NVidia Tesla V100 GPUs to train Resnet44 on the CIFAR10 dataset.
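The DistributedDataParallel mechanism singled out in this abstract can be illustrated with a minimal PyTorch sketch. The model and data below are synthetic placeholders rather than the paper's Resnet44/CIFAR10 setup; the script assumes it is launched with one process per GPU via torchrun.

```python
# Launch with: torchrun --nproc_per_node=<num_gpus> ddp_sketch.py
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    dist.init_process_group(backend="nccl")              # one process per GPU
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).cuda()
    model = DDP(model, device_ids=[torch.cuda.current_device()])

    data = TensorDataset(torch.randn(1024, 32), torch.randint(0, 10, (1024,)))
    sampler = DistributedSampler(data)                    # shards the dataset across ranks
    loader = DataLoader(data, batch_size=64, sampler=sampler)

    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(2):
        sampler.set_epoch(epoch)
        for x, y in loader:
            x, y = x.cuda(), y.cuda()
            opt.zero_grad()
            loss_fn(model(x), y).backward()               # gradients all-reduced by DDP
            opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```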
Pinel, Frédéric
in Communications in Computer and Information Science (2020, February)
We present a procedure for the design of a Deep Neural Network (DNN) that estimates the execution time per batch for training a deep neural network on GPU accelerators. The estimator is intended to be embedded in the scheduler of a shared GPU infrastructure, capable of providing estimated training times for a wide range of network architectures when the user submits a training job. To this end, a very short and simple representation for a given DNN is chosen. In order to compensate for the limited degree of description of this basic network representation, a novel co-evolutionary approach is taken to fit the estimator. The training set for the estimator, i.e. DNNs, is evolved by an evolutionary algorithm that optimizes the accuracy of the estimator. In the process, the genetic algorithm evolves DNNs, generates Python-Keras programs and projects them onto the simple representation. The genetic operators are dynamic; they change with the estimator's accuracy in order to balance accuracy with generalization. Results show that despite the low degree of information in the representation and the simple initial design for the predictor, co-evolving the training set performs better than a near-randomly generated population of DNNs.

Thanapol, Panissara
in 5th International Conference on Information Technology, Bangsaen, 21-22 October 2020 (2020)

Varrette, Sébastien
in Proc. of 13th Intl. Conf. on Parallel Processing and Applied Mathematics (PPAM 2019) (2019, December)
For large-scale systems, such as data centres, energy efficiency has proven to be key for reducing capital and operational expenses and environmental impact. The power drawn by a system is closely related to the type and characteristics of the workload the device is running. For this reason, this paper presents an automatic software tuning method for parallel program generation able to adapt to and exploit the hardware features available on a target computing system, such as an HPC facility or a cloud system, in a better way than traditional compiler infrastructures. We propose a search-based approach combining exact methods and approximated heuristics that evolve programs in order to find optimized configurations, relying on an ever-increasing number of tunable knobs, i.e., code transformation and execution options (such as the number of OpenMP threads and/or the CPU frequency settings). The main objective is to outperform the configurations generated by traditional compiling infrastructures for selected KPIs, i.e., performance, energy and power usage (for both the CPU and DRAM), as well as the runtime. First experimental results tied to the local optimization phase of the proposed framework are encouraging, demonstrating between 8% and 41% improvement for all considered metrics on a reference benchmarking application (i.e., Linpack). This brings novel perspectives for the global optimization step currently under investigation within the presented framework, with the ambition to pave the way toward automatic tuning of energy-aware applications beyond the performance of current state-of-the-art compiler infrastructures.
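A very reduced sketch of the kind of knob exploration described above: sweep the number of OpenMP threads for a benchmark executable and record runtimes. The ./benchmark binary is a placeholder, and energy or power readings (e.g. from RAPL counters) and CPU-frequency settings would be collected in a similar loop but are omitted here.

```python
import os
import subprocess
import time

def run_with_threads(n_threads, binary="./benchmark"):
    """Run the benchmark once with a given OMP_NUM_THREADS and return the wall time."""
    env = dict(os.environ, OMP_NUM_THREADS=str(n_threads))
    start = time.perf_counter()
    subprocess.run([binary], env=env, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start

results = {n: run_with_threads(n) for n in (1, 2, 4, 8, 16)}
best = min(results, key=results.get)
print(f"best OMP_NUM_THREADS: {best} ({results[best]:.2f}s)")
```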
Pinel, Frédéric
in 2017 3rd IEEE International Conference on Cybernetics (CYBCONF) (2017)

Narayanasamy, Shaman
in Genome Biology (2016), 17
Existing workflows for the analysis of multi-omic microbiome datasets are lab-specific and often result in sub-optimal data usage. Here we present IMP, a reproducible and modular pipeline for the integrated and reference-independent analysis of coupled metagenomic and metatranscriptomic data. IMP incorporates robust read preprocessing, iterative co-assembly, analyses of microbial community structure and function, automated binning, as well as genomic signature-based visualizations. The IMP-based data integration strategy enhances data usage, output volume and output quality, as demonstrated using relevant use-cases. Finally, IMP is encapsulated within a user-friendly implementation using Python and Docker. IMP is available at http://r3lab.uni.lu/web/imp/ (MIT license).

Bouvry, Pascal
in Pierson, J.-M. (Ed.) Large-scale Distributed Systems and Energy Efficiency: A Holistic View (2015)

Muller, Emilie
in Nature Communications (2014)
Microbial communities are complex and dynamic systems that are primarily structured according to their members' ecological niches. To investigate how niche breadth (generalist versus specialist lifestyle strategies) relates to ecological success, we develop and apply an integrative workflow for the multi-omic analysis of oleaginous mixed microbial communities from a biological wastewater treatment plant. Time- and space-resolved coupled metabolomic and taxonomic analyses demonstrate that the community-wide lipid accumulation phenotype is associated with the dominance of the generalist bacterium Candidatus Microthrix spp. By integrating population-level genomic reconstructions (reflecting fundamental niches) with transcriptomic and proteomic data (realised niches), we identify finely tuned gene expression governing resource usage by Candidatus Microthrix parvicella over time. Moreover, our results indicate that the fluctuating environmental conditions constrain the accumulation of genetic variation in Candidatus Microthrix parvicella, likely due to fitness trade-offs. Based on our observations, niche breadth has to be considered as an important factor for understanding the evolutionary processes governing (microbial) population sizes and structures in situ.

Pinel, Frédéric
Doctoral thesis (2014)

Laczny, Cedric Christian
in Scientific Reports (2014)
The visualization of metagenomic data, especially without prior taxonomic identification of reconstructed genomic fragments, is a challenging problem in computational biology. An ideal visualization method should, among others, enable clear distinction of congruent groups of sequences of closely related taxa, be applicable to fragments of lengths typically achievable following assembly, and allow the efficient analysis of the growing amounts of community genomic sequence data. Here, we report a scalable approach for the visualization of metagenomic data that is based on nonlinear dimension reduction via Barnes-Hut Stochastic Neighbor Embedding of centered log-ratio transformed oligonucleotide signatures extracted from assembled genomic sequence fragments. The approach allows for alignment-free assessment of the data-inherent taxonomic structure, and it can potentially facilitate the downstream binning of genomic fragments into uniform clusters reflecting organismal origin. We demonstrate the performance of our approach by visualizing community genomic sequence data from simulated as well as groundwater, human-derived and marine microbial communities.
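The embedding pipeline described in this abstract (oligonucleotide signatures, centered log-ratio transform, Barnes-Hut SNE) can be sketched as follows, using scikit-learn's TSNE in place of the authors' BH-SNE implementation and random toy sequences instead of real genomic fragments.

```python
from itertools import product

import numpy as np
from sklearn.manifold import TSNE

KMERS = ["".join(p) for p in product("ACGT", repeat=4)]   # 256 tetranucleotides

def signature(seq, k=4):
    """Centered log-ratio transformed k-mer frequency signature of one fragment."""
    counts = dict.fromkeys(KMERS, 1.0)                     # pseudo-count avoids log(0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1.0
    freqs = np.array([counts[kmer] for kmer in KMERS])
    freqs /= freqs.sum()
    return np.log(freqs) - np.log(freqs).mean()            # CLR transform

rng = np.random.default_rng(0)
fragments = ["".join(rng.choice(list("ACGT"), size=2000)) for _ in range(200)]
X = np.vstack([signature(f) for f in fragments])

embedding = TSNE(n_components=2, method="barnes_hut", perplexity=30).fit_transform(X)
print(embedding.shape)   # (200, 2) coordinates, ready for a scatter plot
```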
Plugaru, Valentin
Report (2014)
The increasing demand for High Performance Computing (HPC), paired with the higher power requirements of ever-faster systems, has led to the search for both performant and more energy-efficient architectures. This article compares and contrasts the performance and energy efficiency of two modern clusters, a traditional Intel Xeon cluster and a low-power ARM-based cluster, which are tested with the recently developed High Performance Conjugate Gradient (HPCG) benchmark and the ABySS, FASTA and MrBayes bioinformatics applications. We show a higher Performance per Watt valuation of the ARM cluster, and lower energy usage during the tests, which does not offset the much faster job completion rate obtained by the Intel cluster, making the latter more suitable for the considered workloads given the disparity in the performance results.

Pinel, Frédéric
in International Journal of Hybrid Intelligent Systems (2014), 11(4), 287-302

Pinel, Frédéric
in Wyrzykowski, Roman; Dongarra, Jack (Eds.) Parallel Processing and Applied Mathematics, 10th International Conference, PPAM 2013, Warsaw, Poland, September 8-11, 2013 (2014)
Muller, Emilie
Poster (2014)
Microbial communities are complex and dynamic systems that are influenced by stochastic-neutral processes but are mainly structured by resource availability and usage. High-resolution "meta-omics" offer exciting prospects to investigate microbial populations in their native environment. In particular, integrated meta-omics, by allowing simultaneous resolution of fundamental niches (genomics) and realised niches (transcriptomics, proteomics and metabolomics), can resolve microbial lifestyle strategies (generalist versus specialist) in situ. We have recently developed the necessary wet- and dry-lab methodologies to carry out systematic molecular measurements of microbial consortia over space and time, and to integrate and analyse the resulting data at the population level. We applied these methods to oleaginous mixed microbial communities located on the surface of anoxic biological wastewater treatment tanks to investigate how niche breadth (generalist versus specialist strategies) relates to community-level phenotypes and ecological success (i.e. population size). Coupled metabolomics and 16S rRNA gene-based deep sequencing demonstrate that the community-wide lipid accumulation phenotype is associated with the dominance of Candidatus Microthrix parvicella. By integrating population-level genomic reconstructions with transcriptomic and proteomic data, we found that the dominance of this microbial generalist population results from finely tuned resource usage and optimal foraging behaviour. Moreover, the fluctuating environmental conditions constrain the accumulation of variations, leading to a genetically homogeneous population, likely due to fitness trade-offs. By integrating metagenomic, metatranscriptomic, metaproteomic and metabolomic information, we demonstrate that natural microbial population sizes and structures are intricately linked to resource usage and that differing microbial lifestyle strategies may explain the varying degrees of within-population genetic heterogeneity observed in metagenomic datasets. Elucidating the exact mechanism driving fitness trade-offs, e.g., antagonistic pleiotropy or others, will require additional integrated omic datasets to be generated from samples taken over space and time. Based on our observations, niche breadth and lifestyle strategies (generalists versus specialists) have to be considered as important factors for understanding the evolutionary processes governing microbial population sizes and structures in situ.