For large-scale systems such as data centers, energy efficiency has proven to be key to reducing capital and operational expenses as well as environmental impact. The power drawn by a system is closely related to the type and characteristics of the workload it is running. For this reason, this paper presents an automatic software tuning method for parallel program generation that adapts to and exploits the hardware features available on a target computing system, such as an HPC facility or a cloud system, better than traditional compiler infrastructures. We propose a search-based approach that combines exact methods and approximate heuristics to evolve programs and find optimized configurations over an ever-increasing number of tunable knobs, i.e., code transformations and execution options (such as the number of OpenMP threads and/or the CPU frequency settings). The main objective is to outperform the configurations generated by traditional compiling infrastructures for selected KPIs, i.e., performance, energy and power usage (for both the CPU and DRAM), as well as the runtime. First experimental results tied to the local optimization phase of the proposed framework are encouraging, demonstrating between 8% and 41% improvement for all considered metrics on a reference benchmarking application (i.e., Linpack). This brings novel perspectives for the global optimization step currently under investigation within the presented framework, with the ambition to pave the way toward automatic tuning of energy-aware applications beyond the performance of the current state-of-the-art compiler infrastructures.
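To make the idea of a local optimization phase over tunable knobs concrete, the following is a minimal sketch of a steepest-descent local search over two execution options named in the abstract (OpenMP thread count and CPU frequency). It is not the authors' framework: the knob values, the `measure` cost model, and all function names are hypothetical, and the synthetic model stands in for real benchmark runs and energy measurements.

```python
# Hypothetical knob space: values are illustrative, not taken from the paper.
KNOBS = {
    "omp_threads": [1, 2, 4, 8, 16],
    "cpu_freq_ghz": [1.2, 1.6, 2.0, 2.4],
}


def measure(config):
    """Synthetic stand-in for a real benchmark run; returns (runtime_s, energy_j)."""
    threads, freq = config["omp_threads"], config["cpu_freq_ghz"]
    runtime = 100.0 / (freq * threads ** 0.8) + 0.5 * threads  # toy scaling plus overhead
    power = 20.0 + 3.0 * threads * freq ** 2                    # toy static plus dynamic power
    return runtime, runtime * power


def energy(config):
    """Objective to minimise: the energy component of the synthetic measurement."""
    return measure(config)[1]


def neighbours(config):
    """Yield configurations that differ from `config` by one step on one knob."""
    for name, values in KNOBS.items():
        i = values.index(config[name])
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                yield {**config, name: values[j]}


def local_search(start, objective):
    """Steepest-descent hill climbing: move to the best neighbour until none improves."""
    current, best = start, objective(start)
    while True:
        candidate = min(neighbours(current), key=objective)
        score = objective(candidate)
        if score >= best:
            return current, best
        current, best = candidate, score


start = {"omp_threads": 1, "cpu_freq_ghz": 2.4}
best_config, best_energy = local_search(start, energy)
print(best_config, round(best_energy, 1))
```

The search only guarantees a local optimum with respect to single-step knob changes, which is why a separate global optimization step, as mentioned in the abstract, would still be needed. Handling several KPIs at once would require a multi-objective formulation instead of the single `energy` objective used here.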
Research center:
ULHPC - University of Luxembourg: High Performance Computing
Disciplines:
Computer science
Author, co-author:
VARRETTE, Sébastien ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
PINEL, Frédéric ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
KIEFFER, Emmanuel ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
DANOY, Grégoire ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Computer Science and Communications Research Unit (CSC)
BOUVRY, Pascal ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
External co-authors:
no
Document language:
English
Title:
Automatic Software Tuning of Parallel Programs for Energy-Aware Executions
Publication date:
December 2019
Event name:
Proc. of 13th Intl. Conf. on Parallel Processing and Applied Mathematics (PPAM 2019)
Event place:
Bialystok, Poland
Event date:
September 8-11, 2019
Event scope:
International
Title of the main work:
Proc. of 13th Intl. Conf. on Parallel Processing and Applied Mathematics (PPAM 2019)