References of "Pinel, Frederic 50026558"
Full Text
Peer Reviewed
Performance Analysis of Distributed and Scalable Deep Learning
Mahon, S.; Varrette, Sébastien UL; Plugaru, Valentin UL et al

in 20th IEEE/ACM Intl. Symp. on Cluster, Cloud and Internet Computing (CCGrid'20) (2020, May)

With renewed global interest in Artificial Intelligence (AI) methods, the past decade has seen a myriad of new programming models and tools that enable better and faster Machine Learning (ML). More recently, a subset of ML known as Deep Learning (DL) has attracted increased interest due to its inherent ability to efficiently tackle novel cognitive computing applications. DL allows computational models that are composed of multiple processing layers to learn, in an automated way, representations of data with multiple levels of abstraction, and can deliver higher predictive accuracy when trained on larger data sets. Based on Artificial Neural Networks (ANN), DL is now at the core of state-of-the-art voice recognition systems (which enable easy control over, e.g., Internet-of-Things (IoT) smart home appliances), self-driving car engines, and online recommendation systems. The ecosystem of DL frameworks is evolving fast, as are the DL architectures that are shown to perform well on specialized tasks and to exploit GPU accelerators. For this reason, frequent performance evaluation of the DL ecosystem is required, especially since the advent of novel distributed training frameworks such as Horovod, which allow scalable training across multiple computing resources. In this paper, the scalability of the reference DL frameworks (TensorFlow, Keras, MXNet, and PyTorch) is evaluated on up-to-date High Performance Computing (HPC) resources to compare the efficiency of different implementations across several hardware architectures (CPU and GPU). Experimental results demonstrate that the DistributedDataParallel feature of the PyTorch library seems to be the most efficient way to distribute the training process across many devices, reaching a throughput speedup of 10.11 when using 12 NVIDIA Tesla V100 GPUs to train ResNet44 on the CIFAR10 dataset.
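
To put the reported figure in context, the speedup can be converted into a parallel efficiency (speedup divided by device count). The sketch below is illustrative only and is not taken from the paper; the helper name `parallel_efficiency` is hypothetical, and the only inputs are the two numbers quoted in the abstract. A speedup of 10.11 on 12 GPUs corresponds to roughly 84% of ideal linear scaling.

```python
def parallel_efficiency(speedup: float, n_devices: int) -> float:
    """Return speedup / n_devices; 1.0 means ideal linear scaling."""
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    return speedup / n_devices


# Figures quoted in the abstract: throughput speedup 10.11 on 12 GPUs.
# The helper itself is a hypothetical illustration, not the authors' code.
efficiency = parallel_efficiency(10.11, 12)
print(f"Parallel efficiency: {efficiency:.2%}")
```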

Full Text
Peer Reviewed
Evolving a Deep Neural Network Training Time Estimator
Pinel, Frédéric UL; Yin, Jian-xiong; Hundt, Christian UL et al

in Communications in Computer and Information Science (2020, February)

We present a procedure for the design of a Deep Neural Network (DNN) that estimates the execution time for training a deep neural network per batch on GPU accelerators. The estimator is destined to be embedded in the scheduler of a shared GPU infrastructure, capable of providing estimated training times for a wide range of network architectures when the user submits a training job. To this end, a very short and simple representation for a given DNN is chosen. In order to compensate for the limited degree of description of the basic network representation, a novel co-evolutionary approach is taken to fit the estimator. The training set for the estimator, i.e. DNNs, is evolved by an evolutionary algorithm that optimizes the accuracy of the estimator. In the process, the genetic algorithm evolves DNNs, generates Python-Keras programs and projects them onto the simple representation. The genetic operators are dynamic: they change with the estimator’s accuracy in order to balance accuracy with generalization. Results show that despite the low degree of information in the representation and the simple initial design for the predictor, co-evolving the training set performs better than a near-randomly generated population of DNNs.

Full Text
Peer Reviewed
Automatic Software Tuning of Parallel Programs for Energy-Aware Executions
Varrette, Sébastien UL; Pinel, Frédéric UL; Kieffer, Emmanuel UL et al

in Proc. of 13th Intl. Conf. on Parallel Processing and Applied Mathematics (PPAM 2019) (2019, December)

For large scale systems, such as data centers, energy efficiency has proven to be key for reducing capital and operational expenses and environmental impact. The power drawn by a system is closely related to the type and characteristics of the workload that the device is running. For this reason, this paper presents an automatic software tuning method for parallel program generation able to adapt to and exploit the hardware features available on a target computing system, such as an HPC facility or a cloud system, in a better way than traditional compiler infrastructures. We propose a search-based approach combining both exact methods and approximated heuristics evolving programs in order to find optimized configurations relying on an ever-increasing number of tunable knobs, i.e., code transformation and execution options (such as the number of OpenMP threads and/or the CPU frequency settings). The main objective is to outperform the configurations generated by traditional compiling infrastructures for selected KPIs, i.e., performance, energy and power usage (for both the CPU and DRAM), as well as the runtime. First experimental results tied to the local optimization phase of the proposed framework are encouraging, demonstrating between 8% and 41% improvement for all considered metrics on a reference benchmarking application (i.e., Linpack). This brings novel perspectives for the global optimization step currently under investigation within the presented framework, with the ambition to pave the way toward automatic tuning of energy-aware applications beyond the performance of the current state-of-the-art compiler infrastructures.

Full Text
Peer Reviewed
Combining machine learning and genetic algorithms to solve the independent tasks scheduling problem
Dorronsoro, Bernabé; Pinel, Frédéric UL

in 2017 3rd IEEE International Conference on Cybernetics (CYBCONF) (2017)

Full Text
Energy-Performance Optimization for the Cloud
Pinel, Frédéric UL

Doctoral thesis (2014)

Full Text
Evaluating the HPC Performance and Energy-Efficiency of Intel and ARM-based systems with synthetic and bioinformatics workloads
Plugaru, Valentin UL; Varrette, Sébastien UL; Pinel, Frédéric UL et al

Report (2014)

The increasing demand for High Performance Computing (HPC), paired with the higher power requirements of ever-faster systems, has led to the search for architectures that are both performant and more energy-efficient. This article compares and contrasts the performance and energy efficiency of two modern clusters, one based on traditional Intel Xeon processors and one on low-power ARM processors, which are tested with the recently developed High Performance Conjugate Gradient (HPCG) benchmark and the ABySS, FASTA and MrBayes bioinformatics applications. We show a higher Performance per Watt valuation of the ARM cluster, and lower energy usage during the tests, which does not offset the much faster job completion rate obtained by the Intel cluster, making the latter more suitable for the considered workloads given the disparity in the performance results.

Full Text
Peer Reviewed
It's Not a Bug, It's a Feature: Wait-free Asynchronous Cellular Genetic Algorithm
Pinel, Frédéric UL; Dorronsoro, Bernabé UL; Bouvry, Pascal UL et al

in Wyrzykowski, Roman; Dongarra, Jack (Eds.) Parallel Processing and Applied Mathematics 10th International Conference, PPAM 2013 Warsaw, Poland, September 8–11, 2013 (2014)

Full Text
Peer Reviewed
Savant: Automatic generation of a parallel scheduling heuristic for map-reduce
Pinel, Frédéric UL; Dorronsoro, Bernabé

in International Journal of Hybrid Intelligent Systems (2014), 11(4), 287-302

Full Text
Peer Reviewed
Comparing the Performance and Power Usage of GPU and ARM Clusters for Map-Reduce
Delplace, V.; Manneback, P.; Pinel, Frédéric UL et al

in Proc. of the 3rd Intl. Conf. on Cloud and Green Computing (CGC'13) (2013, October)

Full Text
Peer Reviewed
Savant: Automatic parallelization of a scheduling heuristic with machine learning
Pinel, Frédéric UL; Dorronsoro, Bernabé UL; Bouvry, Pascal UL et al

in Nature and Biologically Inspired Computing (NaBIC), 2013 World Congress on (2013, August 13)

Full Text
Peer Reviewed
Solving very large instances of the scheduling of independent tasks problem on the GPU
Pinel, Frédéric UL; Dorronsoro, Bernabé UL; Bouvry, Pascal UL

in Journal of Parallel & Distributed Computing (2013), 73(1), 101-110

Full Text
Peer Reviewed
A survey on resource allocation in high performance distributed computing systems
Hussain, Hameed; Malik, Saif Ur Rehman; Hameed, Abdul et al

in Parallel Computing (2013), 39(11), 709-736

Efficient resource allocation is a fundamental requirement in high performance computing (HPC) systems. Many projects dedicated to large-scale distributed computing systems have designed and developed resource allocation mechanisms with a variety of architectures and services. In our study, a comprehensive survey describing resource allocation in various HPC systems is reported. The aim of the work is to aggregate, under a joint framework, the existing solutions for HPC to provide a thorough analysis and characterization of the resource management and allocation strategies. Resource allocation mechanisms and strategies play a vital role in improving the performance of all classes of HPC systems. Therefore, a comprehensive discussion of widely used resource allocation strategies deployed in HPC environments is required, which is one of the motivations of this survey. Moreover, we have classified the HPC systems into three broad categories, namely: (a) cluster, (b) grid, and (c) cloud systems, and define the characteristics of each class by extracting sets of common attributes. All of the aforementioned systems are cataloged into pure software and hybrid/hardware solutions. The system classification is used to identify approaches followed by the implementation of existing resource allocation strategies that are widely presented in the literature.

Full Text
Peer Reviewed
Optimisation of the enhanced distance based broadcasting protocol for MANETs
Ruiz, Patricia UL; Dorronsoro, Bernabé UL; Valentini, Giorgio et al

in Journal of Supercomputing (2012), 62(3), 1213-1240

Full Text
Peer Reviewed
A Multi-Objective GRASP for Energy-aware Scheduling
Pecero, Johnatan UL; Pinel, Frédéric UL; Bouvry, Pascal UL et al

in International Congress on Computer Science Research (2011, October 28)

Full Text
Peer Reviewed
An overview of energy efficiency techniques in cluster computing systems
Valentini, Giorgio Luigi; Lassonde, Walter; Khan, Samee Ullah et al

in Cluster Computing (2011), 16(1), 3-15

Full Text
Peer Reviewed
Energy-efficient scheduling on milliclusters with performance constraints
Pinel, Frédéric UL; Pecero, Johnatan UL; Bouvry, Pascal UL et al

in Green Computing and Communications (GreenCom), 2011 IEEE/ACM International Conference on (2011)

Today’s datacenters and large scale enterprise computing are power hungry. A lot of research effort is devoted in industry and academia to address this challenging issue. In this context, a new type of enterprise computing platform is being investigated. This computing platform is composed of hundreds of millicomputers, each requiring orders of magnitude less power. However, this approach brings challenges that must be met in order to compete with the current practice. This paper addresses two such critical challenges. First, it suggests how to decompose large applications into smaller tasks, better suited to millicomputers. Then, it casts the performance-oriented and energy-efficient problem into a soft real-time scheduling problem, for which several algorithms are then proposed and evaluated. Sensitivity analysis is used to provide insights into the model and to plan the evaluation of the scheduling algorithms. The contention found in multi-core millicomputing processors is also accounted for.

Peer Reviewed
Efficient Hierarchical Task Scheduling on GRIDS Accounting for Computation and Communications
Pecero, Johnatan UL; Pinel, Frédéric UL; Dorronsoro, Bernabé UL et al

in Bouvry, Pascal; González-Vélez, Horacio; Kolodziej, Joanna (Eds.) Intelligent Decision Systems in Large-Scale Distributed Environments, 362 (2011)

Full Text
Peer Reviewed
A Review on Task Performance Prediction in Multi-core Based Systems
Pinel, Frédéric UL; Pecero, Johnatan UL; Bouvry, Pascal UL et al

in Computer and Information Technology (CIT), 2011 IEEE 11th International Conference on (2011)

Full Text
Peer Reviewed
Evolutionary Algorithm Parameter Tuning with Sensitivity Analysis
Pinel, Frédéric UL; Danoy, Grégoire UL; Bouvry, Pascal UL

in Security and Intelligent Information Systems (2011)
