Performance Analysis; Roofline Model; Weather Forecast; Deep Learning Benchmark
Abstract:
[en] We present a detailed performance analysis and characterization of a statistical temperature downscaling application used in the MAELSTROM EuroHPC project. The application uses deep learning to convert low-resolution atmospheric temperature states into high-resolution ones. We perform in-depth profiling and roofline analysis of the downscaling model at several levels (operators, training, distributed training, and inference) on different hardware architectures (Nvidia V100 and A100 GPUs). Finally, we compare the training and inference cost of the downscaling model across various cloud providers. Our results identify the model's bottlenecks, which can be used to improve the model architecture and to choose hardware configurations that use HPC resources efficiently. Furthermore, we provide a comprehensive methodology for in-depth profiling and benchmarking of deep learning models.
Research center:
- Interdisciplinary Centre for Security, Reliability and Trust (SnT) > SEDAN - Service and Data Management in Distributed Systems
Disciplines:
Computer science
Author, co-author:
PANNER SELVAM, Karthick ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN
BRORSSON, Mats Hakan ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN
External co-authors:
no
Document language:
English
Title:
Performance Analysis and Benchmarking of a Temperature Downscaling Deep Learning Model
Publication date:
March 2023
Event name:
31st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
Event location:
Naples, Italy
Event date:
01-03-2023 to 03-03-2023
Event scope:
International
Title of the main work:
31st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, Naples, Italy 1-3 March 2023
Peer reviewed :
Peer reviewed
Focus Area :
Security, Reliability and Trust
FnR project:
FNR15092355 - Machine Learning For Scalable Meteorology And Climate, 2020 (01/04/2021-31/03/2024) - Mats Brorsson