Results 1-20 of 44.
Full Text
Peer Reviewed
Optimizing the Resource and Job Management System of an Academic HPC and Research Computing Facility
Varrette, Sébastien UL; Kieffer, Emmanuel UL; Pinel, Frederic

in 21st IEEE Intl. Symp. on Parallel and Distributed Computing (ISPDC'22) (2022, July)


High Performance Computing (HPC) is nowadays a strategic asset required to sustain the surging demands for massive processing and data-analytic capabilities. In practice, the effective management of such large scale and distributed computing infrastructures is left to a Resource and Job Management System (RJMS). This essential middleware component is responsible for managing the computing resources, handling user requests to allocate resources, and providing an optimized framework for starting, executing and monitoring jobs on the allocated resources. The University of Luxembourg has been operating a large academic HPC facility for 15 years; since 2017 it has relied on the Slurm RJMS, introduced on top of the flagship cluster Iris. The acquisition of a new liquid-cooled supercomputer named Aion, released in 2021, was the occasion to thoroughly review and optimize the initial Slurm configuration, the defined resource limits and the underlying fair-share algorithm. This paper presents the outcomes of this study and details the implemented RJMS policy. The impact of these decisions on the supercomputers' workloads is also described. In particular, the performance evaluation conducted highlights that, compared to the initial configuration, the described and implemented environment brought concrete and measurable improvements with regard to platform utilization (+12.64%), job efficiency (as measured by the average Wall-time Request Accuracy, improved by 110.81%) and management and funding (increased by 10%). The systems demonstrated sustainable and scalable HPC performance, and this effort incurred a negligible penalty on the average slowdown metric (response time normalized by runtime), which increased by only 0.59% for job workloads covering a full year of operation. Overall, this new setup has been in production for 18 months on both supercomputers and the updated model proves to bring a fairer and more satisfying experience to the end users. The proposed configurations and policies may help other HPC centres when designing or improving the RJMS sustaining their job scheduling strategy when computing capacity expands.
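
The Wall-time Request Accuracy metric cited above rewards jobs whose requested wall-time is close to their actual runtime. Below is a minimal sketch of computing such a metric from accounting records; the per-job definition (used over requested wall-time, then averaged) is an assumption for illustration, and the paper's exact aggregation may differ.

```python
def walltime_request_accuracy(jobs):
    """Average ratio of used to requested wall-time (1.0 = perfect request).
    The per-job definition is an assumption about the paper's metric."""
    ratios = [used / requested for requested, used in jobs if requested > 0]
    return sum(ratios) / len(ratios)

# Three jobs, each requesting 2 h, running 30 min, 1 h and 2 h respectively.
jobs = [(7200, 1800), (7200, 3600), (7200, 7200)]
print(walltime_request_accuracy(jobs))   # ~0.58
```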

Full Text
Peer Reviewed
A Variant of Concurrent Constraint Programming on GPU
Talbot, Pierre UL; Pinel, Frederic UL; Bouvry, Pascal UL

in Proceedings of the AAAI Conference on Artificial Intelligence (2022, June), 36(4), 3830-3839


The number of cores on graphical computing units (GPUs) is reaching thousands nowadays, whereas the clock speed of processors stagnates. Unfortunately, constraint programming solvers do not yet take advantage of GPU parallelism. One reason is that constraint solvers were primarily designed within the mental frame of sequential computation. To solve this issue, we take a step back and contribute a simple, intrinsically parallel, lock-free and formally correct programming language based on concurrent constraint programming. We then re-examine parallel constraint solving on GPUs within this formalism, and develop Turbo, a simple constraint solver entirely programmed on GPUs. Turbo validates the correctness of our approach and compares positively to a parallel CPU-based solver.
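
The core of concurrent constraint programming is a set of propagators that monotonically narrow a shared store of domains until a fixpoint is reached; because each propagator only ever shrinks domains, they can run in any order, or concurrently. The sketch below mirrors that semantics sequentially in Python with a toy interval constraint; it is an illustration of the formalism, not Turbo's code, which maps propagators to GPU threads.

```python
def propagate(store, propagators):
    """Run propagators to fixpoint; each narrows domains monotonically."""
    changed = True
    while changed:
        changed = False
        for prop in propagators:
            changed |= prop(store)
    return store

def add_eq_10(store):
    """Bound propagation for x + y = 10 over integer intervals."""
    (xl, xu), (yl, yu) = store['x'], store['y']
    nx = (max(xl, 10 - yu), min(xu, 10 - yl))
    ny = (max(yl, 10 - xu), min(yu, 10 - xl))
    narrowed = nx != store['x'] or ny != store['y']
    store['x'], store['y'] = nx, ny
    return narrowed

store = {'x': (0, 10), 'y': (7, 10)}
print(propagate(store, [add_eq_10]))    # x narrows to (0, 3)
```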

Full Text
Peer Reviewed
RESIF 3.0: Toward a Flexible & Automated Management of User Software Environment on HPC facility
Varrette, Sébastien UL; Kieffer, Emmanuel UL; Pinel, Frederic UL et al

in ACM Practice and Experience in Advanced Research Computing (PEARC'21) (2021, July)


High Performance Computing (HPC) is increasingly identified as a strategic asset and enabler to accelerate the research and the business performed in all areas requiring intensive computing and large-scale Big Data analytic capabilities. The efficient exploitation of heterogeneous computing resources featuring different processor architectures and generations, coupled with the possible presence of GPU accelerators, remains a challenge. The University of Luxembourg has operated a large academic HPC facility since 2007; it remains one of the reference implementations within the country and offers a cutting-edge research infrastructure to Luxembourg public research. The HPC support team invests a significant amount of time (i.e., several months of effort per year) in providing a software environment optimised for hundreds of users, but the complexity of HPC software quickly outpaced the capabilities of classical software management tools. Since 2014, our scientific software stack has been generated and deployed in an automated and consistent way through the RESIF framework, a wrapper on top of Easybuild and Lmod [5] meant to efficiently handle user software generation. A large code refactoring was performed in 2017 to better handle different software sets and roles across multiple clusters, all piloted through a dedicated control repository. With the advent in 2020 of a new supercomputer featuring a different CPU architecture, and to mitigate the identified limitations of the existing framework, we report in this state-of-practice article RESIF 3.0, the latest iteration of our scientific software management suite, now relying on a streamlined Easybuild. It reduced the number of custom configurations previously enforced by specific Slurm and MPI settings by around 90%, while sustaining optimised builds coexisting for different dimensions of CPU and GPU architectures. The workflow for contributing back to the Easybuild community was also automated, and a current work in progress aims at drastically decreasing the build time of a complete software set generation. Overall, most design choices for our wrapper have been motivated by several years of experience in addressing, in a flexible and convenient way, the heterogeneous needs inherent to an academic environment aiming for research excellence. As the code base is publicly available, and as we wish to transparently report also the pitfalls and difficulties met, this tool may help other HPC centres to consolidate their own software management stack.
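
At its core, a wrapper like RESIF drives Easybuild over named software sets for each target architecture. A minimal sketch of that pattern follows; the set layout and easyconfig names are hypothetical, and this is not RESIF's actual code. The `eb --robot` call is the standard Easybuild invocation that resolves and builds missing dependencies.

```python
import subprocess

# Hypothetical software sets; easyconfig names follow Easybuild's usual
# <name>-<version>-<toolchain>.eb naming pattern.
SOFTWARE_SETS = {
    'core': ['GCC-10.2.0.eb', 'OpenMPI-4.0.5-GCC-10.2.0.eb'],
    'bio':  ['SAMtools-1.11-GCC-10.2.0.eb'],
}

def build_set(name):
    for easyconfig in SOFTWARE_SETS[name]:
        # --robot lets Easybuild resolve and build missing dependencies.
        subprocess.run(['eb', easyconfig, '--robot'], check=True)

build_set('core')
```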

Full Text
Peer Reviewed
Comparing elementary cellular automata classifications with a convolutional neural network
Comelli, Thibaud; Pinel, Frederic UL; Bouvry, Pascal UL

in Proceedings of International Conference on Agents and Artificial Intelligence (ICAART) (2021, February 05)


Elementary cellular automata (ECA) are simple dynamic systems which display complex behaviour from simple local interactions. The complex behaviour is apparent in the two-dimensional temporal evolution of a cellular automaton, which can be viewed as an image composed of black and white pixels. The visual patterns within these images inspired several ECA classifications, aimed at matching the automata's properties to observed patterns, visual or statistical. In this paper, we quantitatively compare 11 ECA classifications. In contrast to the a priori logic behind a classification, we propose an a posteriori evaluation of a classification. The evaluation employs a convolutional neural network, trained to classify each ECA to its assigned class in a classification. The prediction accuracy indicates how well the convolutional neural network is able to learn the underlying classification logic, and reflects how well this classification logic clusters patterns in the temporal evolution. Results show varying prediction accuracies (yet all above 85%); three classifications are very well captured by our simple convolutional neural network (accuracy above 99%), even though it is trained on a small extract of the temporal evolution and with few observations (100 per ECA, evolving 513 cells). In addition, we explain an unreported "pathological" behaviour in two ECAs.
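
The CNN's input is the temporal evolution itself: a rule is run for a number of steps and the resulting 0/1 grid is treated as an image. A short sketch of generating such an image is below; the rule number, width and depth are illustrative, not the paper's exact 513-cell setup.

```python
import numpy as np

def eca_image(rule, width=64, steps=64, seed=0):
    """Temporal evolution of an ECA as a 0/1 array (rows = time steps)."""
    table = [(rule >> i) & 1 for i in range(8)]       # Wolfram rule table
    rng = np.random.default_rng(seed)
    row = rng.integers(0, 2, width)
    rows = [row]
    for _ in range(steps - 1):
        left, right = np.roll(row, 1), np.roll(row, -1)
        row = np.array([table[4 * l + 2 * c + r]      # lookup per neighbourhood
                        for l, c, r in zip(left, row, right)])
        rows.append(row)
    return np.stack(rows)

print(eca_image(110).shape)     # (64, 64) image, ready as CNN input
```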

Full Text
Peer Reviewed
Proximal Policy Optimisation for a Private Equity Recommitment System
Kieffer, Emmanuel UL; Pinel, Frederic UL; Meyer, Thomas et al

in Springer CCIS series (2021)


Recommitments are essential for limited partner investors to maintain a target exposure to private equity. However, recommitting to new funds is irrevocable and exposes investors to cashflow uncertainty and illiquidity. Maintaining a specific target allocation is therefore a tedious and critical task. Unfortunately, recommitment strategies are still manually designed and few works in the literature have endeavoured to develop a recommitment system balancing opportunity cost and risk of default. Due to its strong similarities to a control system, we propose to “learn how to recommit” with Reinforcement Learning (RL) and, more specifically, using Proximal Policy Optimisation (PPO). To the best of our knowledge, this is the first time an RL algorithm has been applied to private equity with the aim of solving the recommitment problem. After training the RL model on simulated portfolios, the resulting recommitment policy is compared to state-of-the-art strategies. Numerical results suggest that the trained policy can achieve high target allocation while bounding the risk of being overinvested.
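
Viewed as a control problem, the state is the portfolio (NAV, uncalled commitments, cash) and the action is the fraction of available cash to recommit each period. The toy dynamics below are purely illustrative assumptions, not the paper's simulator; a PPO agent would replace the fixed fraction with a learned policy, rewarded for tracking the target allocation without driving cash negative.

```python
import random

def step(state, recommit_fraction, call_rate=0.25, dist_rate=0.20):
    """One period of a toy private-equity portfolio; all rates are
    illustrative assumptions, not calibrated to real cashflow data."""
    nav, uncalled, cash = state
    called = call_rate * uncalled           # capital calls on open commitments
    distributed = dist_rate * nav           # distributions back to the investor
    growth = nav * random.gauss(0.02, 0.05) # noisy fund value growth
    nav = nav + called + growth - distributed
    cash = cash + distributed - called      # cash < 0 would mean default
    new_commitment = recommit_fraction * max(cash, 0.0)
    uncalled = uncalled - called + new_commitment
    cash -= new_commitment
    return (nav, uncalled, cash)

state = (100.0, 50.0, 10.0)                 # NAV, uncalled commitments, cash
for _ in range(12):
    state = step(state, recommit_fraction=0.5)
print(state)
```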

Full Text
Peer Reviewed
Evolutionary Learning of Private Equity Recommitment Strategies
Kieffer, Emmanuel UL; Pinel, Frederic UL; Meyer, Thomas et al

in 2021 IEEE Symposium Series on Computational Intelligence (SSCI) (2021)


Achieving and maintaining high allocations to Private Equity, and keeping allocations at the targeted level through recommitment strategies, is a complex task which needs to be balanced against the risk of becoming a defaulting investor. When looking at recommitments we are quickly faced with a combinatorial explosion of the solution space, rendering explicit enumeration impossible. As a consequence, manual management, if any, is time-consuming and error-prone. For this reason, investors need guidance and decision-aid algorithms producing reliable, robust and trustworthy recommitment strategies. In this work, we propose to automatically generate recommitment strategies based on the evolution of symbolic expressions, to provide clear and understandable decision rules to Private Equity experts and investors. To the best of our knowledge, this is the first time a methodology to learn recommitment strategies using evolutionary learning has been proposed. Experiments demonstrate the capacity of the proposed approach to generate efficient and robust strategies, keeping a high degree of investment while bounding the risk of being overinvested.
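
The key difference from the PPO approach above is interpretability: the evolved artefact is a symbolic expression over portfolio features that experts can read. A minimal sketch of generating and evaluating such expressions follows; the feature names, operator set and clamping rule are illustrative assumptions. Selection, crossover and mutation over such trees, with fitness from portfolio simulations, would complete the evolutionary loop and are omitted here.

```python
import random

TERMINALS = ['nav_gap', 'cash_ratio', '0.5', '1.0']   # hypothetical features
OPS = ['+', '-', '*']

def random_expr(depth=2):
    """Grow a random arithmetic expression over the feature terminals."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    return '(%s %s %s)' % (random_expr(depth - 1),
                           random.choice(OPS),
                           random_expr(depth - 1))

def recommit_fraction(expr, nav_gap, cash_ratio):
    # Clamp to [0, 1] so any evolved rule yields a valid fraction.
    value = eval(expr, {'__builtins__': {}},
                 {'nav_gap': nav_gap, 'cash_ratio': cash_ratio})
    return min(max(value, 0.0), 1.0)

rule = random_expr(depth=3)
print(rule, '->', recommit_fraction(rule, nav_gap=0.1, cash_ratio=0.3))
```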

Full Text
Peer Reviewed
Performance Analysis of Distributed and Scalable Deep Learning
Mahon, S.; Varrette, Sébastien UL; Plugaru, Valentin UL et al

in 20th IEEE/ACM Intl. Symp. on Cluster, Cloud and Internet Computing (CCGrid'20) (2020, May)


With renewed global interest for Artificial Intelligence (AI) methods, the past decade has seen a myriad of new programming models and tools that enable better and faster Machine Learning (ML). More recently, a subset of ML known as Deep Learning (DL) has raised increased interest due to its inherent ability to tackle efficiently novel cognitive computing applications. DL allows computational models that are composed of multiple processing layers to learn, in an automated way, representations of data with multiple levels of abstraction, and can deliver higher predictive accuracy when trained on larger data sets. Based on Artificial Neural Networks (ANN), DL is now at the core of state-of-the-art voice recognition systems (which enable easy control over e.g. Internet-of-Things (IoT) smart home appliances), self-driving car engines and online recommendation systems. The ecosystem of DL frameworks is fast evolving, as are the DL architectures that are shown to perform well on specialized tasks and to exploit GPU accelerators. For this reason, frequent performance evaluation of the DL ecosystem is required, especially since the advent of novel distributed training frameworks such as Horovod allowing for scalable training across multiple computing resources. In this paper, the scalability evaluation of the reference DL frameworks (Tensorflow, Keras, MXNet, and PyTorch) is performed over up-to-date High Performance Computing (HPC) resources to compare the efficiency of different implementations across several hardware architectures (CPU and GPU). Experimental results demonstrate that the DistributedDataParallel features in the PyTorch library seem to be the most efficient framework for distributing the training process across many devices, reaching a throughput speedup of 10.11 when using 12 NVidia Tesla V100 GPUs to train Resnet44 on the CIFAR10 dataset.
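
For reference, the DistributedDataParallel pattern highlighted in the results looks roughly as follows. This is a generic PyTorch sketch with a placeholder model, not the paper's benchmark code (which trained Resnet44 on CIFAR10), and it assumes a single node where the process rank doubles as the GPU index.

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group('nccl')             # one process per GPU
    rank = dist.get_rank()                      # single-node assumption
    torch.cuda.set_device(rank)
    model = torch.nn.Linear(128, 10).cuda(rank) # placeholder for Resnet44
    model = DDP(model, device_ids=[rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(100):
        x = torch.randn(32, 128, device=f'cuda:{rank}')
        loss = model(x).sum()                   # gradients all-reduce in backward()
        opt.zero_grad()
        loss.backward()
        opt.step()
    dist.destroy_process_group()

if __name__ == '__main__':
    main()   # launch with e.g.: torchrun --nproc_per_node=4 train.py
```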

Full Text
Peer Reviewed
Evolving a Deep Neural Network Training Time Estimator
Pinel, Frédéric UL; Yin, Jian-xiong; Hundt, Christian UL et al

in Communications in Computer and Information Science (2020, February)


We present a procedure for the design of a Deep Neural Network (DNN) that estimates the execution time for training a deep neural network per batch on GPU accelerators. The estimator is destined to be embedded in the scheduler of a shared GPU infrastructure, capable of providing estimated training times for a wide range of network architectures when the user submits a training job. To this end, a very short and simple representation of a given DNN is chosen. In order to compensate for the limited degree of description of the basic network representation, a novel co-evolutionary approach is taken to fit the estimator. The training set for the estimator, i.e. DNNs, is evolved by an evolutionary algorithm that optimizes the accuracy of the estimator. In the process, the genetic algorithm evolves DNNs, generates Python-Keras programs and projects them onto the simple representation. The genetic operators are dynamic; they change with the estimator's accuracy in order to balance accuracy with generalization. Results show that despite the low degree of information in the representation and the simple initial design for the predictor, co-evolving the training set performs better than a near-randomly generated population of DNNs.
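
To make the "short and simple representation" concrete, here is a hedged sketch of projecting a network onto a few coarse features and querying a fitted estimator. The feature choice and the linear stand-in for the trained estimator are assumptions for illustration, not the paper's encoding.

```python
def features(layers, batch_size):
    """Project a network onto a deliberately crude feature vector; the
    (kind, width) layer encoding is a hypothetical stand-in."""
    n_conv  = sum(1 for kind, _ in layers if kind == 'conv')
    n_dense = sum(1 for kind, _ in layers if kind == 'dense')
    units   = sum(width for _, width in layers)
    return [n_conv, n_dense, units, batch_size]

# A fitted linear model standing in for the trained estimator.
WEIGHTS = [2.5, 0.8, 0.001, 0.05]     # hypothetical coefficients (ms)

def estimate_batch_time_ms(layers, batch_size):
    return sum(w * f for w, f in zip(WEIGHTS, features(layers, batch_size)))

net = [('conv', 64), ('conv', 64), ('dense', 128)]
print(estimate_batch_time_ms(net, batch_size=32))   # estimated ms per batch
```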

Full Text
Peer Reviewed
Reducing overfitting and improving generalization in training convolutional neural network under limited sample sizes in image recognition
Thanapol, Panissara UL; Lavangnananda, Kittichai; Bouvry, Pascal UL et al

in 5th International Conference on Information Technology, Bangsaen, 21-22 October 2020 (2020)

Full Text
Peer Reviewed
Automatic Software Tuning of Parallel Programs for Energy-Aware Executions
Varrette, Sébastien UL; Pinel, Frédéric UL; Kieffer, Emmanuel UL et al

in Proc. of 13th Intl. Conf. on Parallel Processing and Applied Mathematics (PPAM 2019) (2019, December)


For large scale systems, such as data centers, energy efficiency has proven to be key for reducing capital and operational expenses and environmental impact. Power drainage of a system is closely related to the type and characteristics of the workload that the device is running. For this reason, this paper presents an automatic software tuning method for parallel program generation able to adapt to and exploit the hardware features available on a target computing system, such as an HPC facility or a cloud system, in a better way than traditional compiler infrastructures. We propose a search-based approach combining both exact methods and approximated heuristics evolving programs, in order to find optimized configurations relying on an ever-increasing number of tunable knobs, i.e., code transformation and execution options (such as the number of OpenMP threads and/or the CPU frequency settings). The main objective is to outperform the configurations generated by traditional compiling infrastructures for selected KPIs, i.e., performance, energy and power usage (for both the CPU and DRAM), as well as the runtime. First experimental results tied to the local optimization phase of the proposed framework are encouraging, demonstrating between 8% and 41% improvement for all considered metrics on a reference benchmarking application (i.e., Linpack). This brings novel perspectives for the global optimization step currently under investigation within the presented framework, with the ambition to pave the way toward automatic tuning of energy-aware applications beyond the performance of the current state-of-the-art compiler infrastructures.
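
As an illustration of the execution-option side of this search space, the sketch below sweeps a single knob (the number of OpenMP threads) and keeps the fastest configuration. The './app' binary is a placeholder; a real run would also record energy and power counters (e.g. via RAPL) and explore code transformations, which the sketch omits.

```python
import os
import subprocess
import time

def run_with(threads, binary='./app'):     # './app' is a placeholder binary
    """Time one run of an OpenMP program with a given thread count."""
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    start = time.perf_counter()
    subprocess.run([binary], env=env, check=True)
    return time.perf_counter() - start

runtime, threads = min((run_with(t), t) for t in (1, 2, 4, 8, 16))
print('best: %.2fs with %d OpenMP threads' % (runtime, threads))
```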

Full Text
Peer Reviewed
Combining machine learning and genetic algorithms to solve the independent tasks scheduling problem
Dorronsoro, Bernabé; Pinel, Frédéric UL

in 2017 3rd IEEE International Conference on Cybernetics (CYBCONF) (2017)

Full Text
Peer Reviewed
IMP: a pipeline for reproducible reference-independent integrated metagenomic and metatranscriptomic analyses
Narayanasamy, Shaman UL; Jarosz, Yohan UL; Muller, Emilie UL et al

in Genome Biology (2016), 17


Existing workflows for the analysis of multi-omic microbiome datasets are lab-specific and often result in sub-optimal data usage. Here we present IMP, a reproducible and modular pipeline for the integrated and reference-independent analysis of coupled metagenomic and metatranscriptomic data. IMP incorporates robust read preprocessing, iterative co-assembly, analyses of microbial community structure and function, automated binning, as well as genomic signature-based visualizations. The IMP-based data integration strategy enhances data usage, output volume, and output quality as demonstrated using relevant use-cases. Finally, IMP is encapsulated within a user-friendly implementation using Python and Docker. IMP is available at http://r3lab.uni.lu/web/imp/ (MIT license).

Peer Reviewed
Energy Efficiency and High-Performance Computing
Bouvry, Pascal UL; Chetsa, G. L. T.; Costa, G. Da et al

in Pierson, J.-M. (Ed.) Large-scale Distributed Systems and Energy Efficiency: A Holistic View (2015)

Full Text
Peer Reviewed
Community-integrated omics links dominance of a microbial generalist to fine-tuned resource usage
Muller, Emilie UL; Pinel, Nicolas; Laczny, Cedric Christian UL et al

in Nature Communications (2014)


Microbial communities are complex and dynamic systems that are primarily structured according to their members’ ecological niches. To investigate how niche breadth (generalist versus specialist lifestyle strategies) relates to ecological success, we develop and apply an integrative workflow for the multi-omic analysis of oleaginous mixed microbial communities from a biological wastewater treatment plant. Time- and space-resolved coupled metabolomic and taxonomic analyses demonstrate that the community-wide lipid accumulation phenotype is associated with the dominance of the generalist bacterium Candidatus Microthrix spp. By integrating population-level genomic reconstructions (reflecting fundamental niches) with transcriptomic and proteomic data (realised niches), we identify finely tuned gene expression governing resource usage by Candidatus Microthrix parvicella over time. Moreover, our results indicate that the fluctuating environmental conditions constrain the accumulation of genetic variation in Candidatus Microthrix parvicella, likely due to fitness trade-offs. Based on our observations, niche breadth has to be considered as an important factor for understanding the evolutionary processes governing (microbial) population sizes and structures in situ.

Full Text
Energy-Performance Optimization for the Cloud
Pinel, Frédéric UL

Doctoral thesis (2014)

Full Text
Peer Reviewed
Alignment-free Visualization of Metagenomic Data by Nonlinear Dimension Reduction
Laczny, Cedric Christian UL; Pinel, Nicolás; Vlassis, Nikos UL et al

in Scientific Reports (2014)


The visualization of metagenomic data, especially without prior taxonomic identification of reconstructed genomic fragments, is a challenging problem in computational biology. An ideal visualization method should, among other things, enable clear distinction of congruent groups of sequences of closely related taxa, be applicable to fragments of lengths typically achievable following assembly, and allow the efficient analysis of the growing amounts of community genomic sequence data. Here, we report a scalable approach for the visualization of metagenomic data that is based on nonlinear dimension reduction via Barnes-Hut Stochastic Neighbor Embedding of centered log-ratio transformed oligonucleotide signatures extracted from assembled genomic sequence fragments. The approach allows for alignment-free assessment of the data-inherent taxonomic structure, and it can potentially facilitate the downstream binning of genomic fragments into uniform clusters reflecting organismal origin. We demonstrate the performance of our approach by visualizing community genomic sequence data from simulated as well as groundwater, human-derived and marine microbial communities.
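
A condensed sketch of this pipeline on synthetic fragments: tetranucleotide signatures, the centered log-ratio transform, then Barnes-Hut t-SNE. scikit-learn's TSNE uses the Barnes-Hut approximation by default; the fragment lengths and counts here are illustrative, and real signatures would count overlapping k-mers on both strands rather than the simplified count used below.

```python
import numpy as np
from itertools import product
from sklearn.manifold import TSNE

KMERS = [''.join(p) for p in product('ACGT', repeat=4)]

def clr_signature(seq, pseudocount=1.0):
    # Non-overlapping counts for brevity; pseudocount avoids log(0).
    counts = np.array([seq.count(k) for k in KMERS], dtype=float) + pseudocount
    logs = np.log(counts / counts.sum())
    return logs - logs.mean()             # centered log-ratio transform

rng = np.random.default_rng(0)
fragments = [''.join(rng.choice(list('ACGT'), size=2000)) for _ in range(60)]
X = np.array([clr_signature(f) for f in fragments])
embedding = TSNE(n_components=2, perplexity=15).fit_transform(X)
print(embedding.shape)                    # (60, 2) points to plot
```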

Full Text
Evaluating the HPC Performance and Energy-Efficiency of Intel and ARM-based systems with synthetic and bioinformatics workloads
Plugaru, Valentin UL; Varrette, Sébastien UL; Pinel, Frédéric UL et al

Report (2014)


The increasing demand for High Performance Computing (HPC), paired with the higher power requirements of ever-faster systems, has led to the search for both performant and more energy-efficient architectures. This article compares and contrasts the performance and energy efficiency of two modern clusters, a traditional Intel Xeon one and a low-power ARM-based one, which are tested with the recently developed High Performance Conjugate Gradient (HPCG) benchmark and the ABySS, FASTA and MrBayes bioinformatics applications. We show a higher Performance-per-Watt valuation of the ARM cluster and lower energy usage during the tests, which however does not offset the much faster job completion rate obtained by the Intel cluster, making the latter more suitable for the considered workloads given the disparity in the performance results.

Full Text
Peer Reviewed
Savant: Automatic generation of a parallel scheduling heuristic for map-reduce
Pinel, Frédéric UL; Dorronsoro, Bernabé

in International Journal of Hybrid Intelligent Systems (2014), 11(4), 287-302

Full Text
Peer Reviewed
It's Not a Bug, It's a Feature: Wait-free Asynchronous Cellular Genetic Algorithm
Pinel, Frédéric UL; Dorronsoro, Bernabé UL; Bouvry, Pascal UL et al

in Wyrzykowski, Roman; Dongarra, Jack (Eds.) Parallel Processing and Applied Mathematics 10th International Conference, PPAM 2013 Warsaw, Poland, September 8–11, 2013 (2014)

Community integrated omics links the dominance of a microbial generalist to fine-tuned resource usage
Muller, Emilie UL; Pinel, Nicolás; Laczny, Cedric Christian UL et al

Poster (2014)


Microbial communities are complex and dynamic systems that are influenced by stochastic-neutral processes but are mainly structured by resource availability and usage. High-resolution “meta-omics” offer exciting prospects to investigate microbial populations in their native environment. In particular, integrated meta-omics, by allowing simultaneous resolution of fundamental niches (genomics) and realised niches (transcriptomics, proteomics and metabolomics), can resolve microbial lifestyle strategies (generalist versus specialist) in situ. We have recently developed the necessary wet- and dry-lab methodologies to carry out systematic molecular measurements of microbial consortia over space and time, and to integrate and analyse the resulting data at the population level. We applied these methods to oleaginous mixed microbial communities located on the surface of anoxic biological wastewater treatment tanks to investigate how niche breadth (generalist versus specialist strategies) relates to community-level phenotypes and ecological success (i.e. population size). Coupled metabolomics and 16S rRNA gene-based deep sequencing demonstrate that the community-wide lipid accumulation phenotype is associated with the dominance of Candidatus Microthrix parvicella. By integrating population-level genomic reconstructions with transcriptomic and proteomic data, we found that the dominance of this microbial generalist population results from finely tuned resource usage and optimal foraging behaviour. Moreover, the fluctuating environmental conditions constrain the accumulation of variations, leading to a genetically homogeneous population, likely due to fitness trade-offs. By integrating metagenomic, metatranscriptomic, metaproteomic and metabolomic information, we demonstrate that natural microbial population sizes and structures are intricately linked to resource usage and that differing microbial lifestyle strategies may explain the varying degrees of within-population genetic heterogeneity observed in metagenomic datasets. Elucidating the exact mechanism driving fitness trade-offs, e.g., antagonistic pleiotropy or others, will require additional integrated omic datasets to be generated from samples taken over space and time. Based on our observations, niche breadth and lifestyle strategies (generalists versus specialists) have to be considered as important factors for understanding the evolutionary processes governing microbial population sizes and structures in situ.
