We show performance results from running the FEniCS Project finite element
software on Amazon Web Services (AWS) c7g and c7gn instances with Graviton3
processors. Graviton3 processors implement the ARMv8.4-A instruction set and provide
the Scalable Vector Extension (SVE) for Single Instruction Multiple Data (SIMD)
operations. We compared the clang 18 and GCC 13 series compilers on an
FFCx-generated high-order Laplace finite element kernel. Both compilers
emitted automatically vectorised loops with fused multiply-add instructions
on SVE registers. The AWS c7gn instances include an Elastic Fabric Adapter (EFA)
interconnect with a bandwidth of 200 GB s⁻¹ for low-latency and high-bandwidth
communication. We tested multi-node weak scalability of a DOLFINx Poisson solver
up to 512 MPI processes. We find that the overall performance and weak scalability of
the c7gn-based cluster are similar to those of a dedicated AMD EPYC Rome x86-64 cluster
installed at the University of Luxembourg.
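For readers unfamiliar with the term, weak scalability is measured by holding the problem size per MPI process fixed while the process count grows; ideal behaviour is a constant wall-clock time. The following is a minimal sketch of how a weak-scaling efficiency could be computed from such timings. The function name and the timing values are hypothetical illustrations and are not taken from the preprint.

```python
def weak_scaling_efficiency(timings):
    """Return {process_count: efficiency} relative to the smallest run.

    `timings` maps an MPI process count to a wall-clock time in seconds,
    measured with the problem size per process held fixed (weak scaling).
    An efficiency of 1.0 means the time did not grow with process count.
    """
    if not timings:
        raise ValueError("timings must not be empty")
    base_procs = min(timings)          # smallest process count is the baseline
    base_time = timings[base_procs]
    return {p: base_time / t for p, t in sorted(timings.items())}


# Hypothetical wall-clock times in seconds; NOT measurements from the preprint.
example_timings = {1: 10.0, 8: 10.5, 64: 11.0, 512: 12.5}

for procs, eff in weak_scaling_efficiency(example_timings).items():
    print(f"{procs:4d} processes: efficiency {eff:.2f}")
```

Comparing two clusters under this definition amounts to comparing their efficiency values at matching process counts, alongside the absolute timings.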
Disciplines :
Computer science
Engineering, computing & technology: Multidisciplinary, general & others
Author, co-author :
HABERA, Michal ; University of Luxembourg > Faculty of Science, Technology and Medicine > Department of Engineering > Team Andreas ZILIAN
HALE, Jack ; University of Luxembourg > Faculty of Science, Technology and Medicine (FSTM) > Department of Engineering (DoE)
Language :
English
Title :
The FEniCS Project on AWS Graviton3
Publication date :
2024
Version :
Submitted preprint
Number of pages :
10
Focus Area :
Computational Sciences
FnR Project :
FNR17205623 - Constraint Aware Optimization Of Topology In Design-for-additive-manufacturing, 2022 (01/11/2022-31/10/2024) - Michal Habera
Funders :
FNR - Fonds National de la Recherche [LU]
Funding text :
This research was funded in whole, or in part, by the National Research Fund (FNR), grant reference
COAT/17205623. For the purpose of open access, and in fulfillment of the obligations arising from
the grant agreement, the author has applied a Creative Commons Attribution 4.0 International (CC
BY 4.0) license to any Author Accepted Manuscript version arising from this submission.