References of "Mainassara Chekaraou, Abdoul Wahid 50024942"
Full Text
Peer Reviewed
Verlet buffer for broad phase interaction detection in Discrete Element Method
Mainassara Chekaraou, Abdoul Wahid UL; Rousset, Alban UL; Besseron, Xavier UL et al

Poster (2018, September 24)

The Extended Discrete Element Method (XDEM) is a novel and innovative numerical simulation technique that extends the dynamics of granular materials or particles, as described by the classical discrete element method (DEM), with additional properties such as the thermodynamic state and stress/strain for each particle. The DEM simulations that industries use to set up their experimental processes are complex and computationally expensive. Simulations therefore have to be precise, efficient and fast in order to process hundreds of millions of particles. To tackle this issue, such DEM simulations are usually parallelized with MPI. One of the most expensive parts of a DEM simulation is the collision detection of particles. It is classically divided into two steps: the broad phase and the narrow phase. The broad phase uses simplified bounding volumes to perform an approximate but fast collision detection. It returns a list of particle pairs that could interact. The narrow phase is applied to the result of the broad phase and returns the exact list of colliding particles. The goal of this research is to apply a Verlet buffer method to (X)DEM simulations regardless of which broad phase algorithm is used. We rely on the fact that such DEM simulations are temporally coherent: the neighborhood changes only slightly from one time-step to the next. We use the Verlet buffer method to extend the list of pairs returned by the broad phase by stretching each particle's bounding volume by an extension range. This allows the result of the broad phase to be re-used for several time-steps before an update is required, thereby reducing the number of times the broad phase is executed. We have implemented a condition based on particle displacements to ensure the validity of the broad phase: a new one is executed to update the list of colliding particles only when necessary. This guarantees identical results because the approximations introduced in the broad phase by our approach are corrected in the narrow phase, which is executed at every time-step anyway. We perform an extensive study to evaluate the influence of the Verlet extension range on the performance of the execution in terms of computation time and memory consumption. We consider different test cases, partitioners (ORB, Zoltan, METIS, SCOTCH, ...), broad phase algorithms (Link cell, Sweep and prune, ...) and grid configurations (fine, coarse), in sequential and parallel executions (up to 280 cores). While a larger Verlet buffer increases the cost of the broad phase and narrow phase, it also allows skipping a significant number of broad phase executions (> 99%). As a consequence, our first results show that this approach can speed up the total execution time by up to a factor of 5 for sequential executions, and by up to a factor of 3 for parallel executions on 280 cores, while maintaining a reasonable memory consumption.
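
The Verlet buffer idea summarized in this abstract can be illustrated with a minimal Python sketch. It is not the XDEM implementation: the spherical particles, the brute-force `broad_phase` (a stand-in for any broad phase algorithm), the rebuild rule "no particle has moved more than the extension range `skin`", and all names (`VerletBuffer`, `skin`, `run_demo`) are assumptions made for this example.

```python
import math
from itertools import combinations


def broad_phase(positions, radii, skin):
    """Candidate pairs from bounding spheres inflated by `skin`.

    Brute force is used here only as a stand-in; any broad phase
    algorithm working on the inflated volumes could replace it.
    """
    pairs = []
    for i, j in combinations(range(len(positions)), 2):
        reach = radii[i] + radii[j] + 2.0 * skin
        if math.dist(positions[i], positions[j]) <= reach:
            pairs.append((i, j))
    return pairs


def narrow_phase(positions, radii, candidates):
    """Exact sphere-sphere test, run at every time-step."""
    return [(i, j) for i, j in candidates
            if math.dist(positions[i], positions[j]) <= radii[i] + radii[j]]


class VerletBuffer:
    """Re-uses the broad phase result while displacements stay small."""

    def __init__(self, skin):
        self.skin = skin          # extension range (assumed name)
        self.reference = None     # positions at the last broad phase
        self.candidates = []
        self.rebuilds = 0

    def _needs_rebuild(self, positions):
        if self.reference is None or len(positions) != len(self.reference):
            return True
        # The cached list stays valid while every particle has moved
        # at most `skin` since the last broad phase.
        return any(math.dist(p, q) > self.skin
                   for p, q in zip(positions, self.reference))

    def contacts(self, positions, radii):
        if self._needs_rebuild(positions):
            self.candidates = broad_phase(positions, radii, self.skin)
            self.reference = [tuple(p) for p in positions]
            self.rebuilds += 1
        return narrow_phase(positions, radii, self.candidates)


def run_demo(steps):
    """Move one sphere towards another; compare with a full check."""
    radii = [0.5, 0.5]
    buffer = VerletBuffer(skin=0.25)
    mismatches = 0
    for k in range(steps):
        positions = [(0.0, 0.0, 0.0), (3.0 - 0.05 * k, 0.0, 0.0)]
        got = buffer.contacts(positions, radii)
        expected = narrow_phase(positions, radii, [(0, 1)])
        mismatches += got != expected
    return buffer.rebuilds, mismatches
```

In this sketch, `run_demo(50)` returns the number of broad phase executions and the number of time-steps where the buffered result differs from a full check. Because each sphere is inflated by `skin` and a rebuild is triggered as soon as any displacement exceeds `skin`, a pair in contact is always present in the cached candidate list, so the narrow phase returns the same contacts as it would without the buffer.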

Full Text
Peer Reviewed
Hybrid MPI+OpenMP Implementation of eXtended Discrete Element Method
Mainassara Chekaraou, Abdoul Wahid UL; Rousset, Alban UL; Besseron, Xavier UL et al

in Proc. of the 9th Workshop on Applications for Multi-Core Architectures (WAMCA'18), part of 30th Intl. Symp. on Computer Architecture and High Performance Computing (SBAC-PAD 2018) (2018, September)

The Extended Discrete Element Method (XDEM) is a novel and innovative numerical simulation technique that extends the classical Discrete Element Method (DEM), which simulates the motion of granular material, with additional properties such as the chemical composition, thermodynamic state and stress/strain for each particle. It has been applied successfully in numerous industries involving the processing of granular materials such as sand, rock, wood or coke [16], [17]. In this context, computational simulation with (X)DEM has become an increasingly essential tool for researchers and scientific engineers to set up and explore their experimental processes. However, increasing the size or the accuracy of a model requires running a parallelized implementation on High Performance Computing (HPC) platforms to accommodate the growing needs in terms of memory and computation time. In practice, such a parallelization is traditionally obtained using either MPI (distributed memory computing), OpenMP (shared memory computing) or hybrid approaches combining both. In this paper, we present the results of our effort to implement an OpenMP version of XDEM allowing hybrid MPI+OpenMP simulations (XDEM being already parallelized with MPI). Far from the basic OpenMP paradigm and recommendations (which essentially amount to decorating the main computation loops with a set of OpenMP pragmas), the OpenMP parallelization of XDEM required fundamental code refactoring and careful tuning in order to reach good performance. There are two main reasons for these difficulties. Firstly, XDEM is a legacy code developed for more than 10 years, initially focused on accuracy rather than performance. Secondly, the particles in a DEM simulation are highly dynamic: they can be added or deleted, and interaction relations can change at any time-step of the simulation. This article therefore details the multiple layers of optimization applied, such as in-depth profiling and reorganization of data structures, the use of fast multithreaded memory allocators and of advanced process/thread-to-core pinning techniques. Experimental results evaluate the benefit of each optimization individually and validate the implementation using a real-world application executed on the HPC platform of the University of Luxembourg. Finally, we present our hybrid MPI+OpenMP results, which show a 15%-20% performance gain, and explain how this approach overcomes the scalability limits of XDEM-based pure MPI simulations (by increasing the number of compute cores without a drop in performance).

Full Text
Peer Reviewed
Comparing Broad-Phase Interaction Detection Algorithms for Multiphysics DEM Applications
Rousset, Alban UL; Mainassara Chekaraou, Abdoul Wahid UL; Liao, Yu-Chung UL et al

in AIP Conference Proceedings ICNAAM 2017 (2017, September)

Collision detection is an ongoing subject of research and optimization in many fields, including video games and numerical simulations [6, 7, 8]. The goal of collision detection is to report a geometric contact when it is about to occur or has actually occurred. Unfortunately, detailed and exact collision detection for large numbers of objects represents an immense amount of computation, naively n² operations with n being the number of objects [9]. To reduce these expensive computations, collision detection is decomposed into two phases, as shown in Figure 1: the Broad-Phase and the Narrow-Phase. In this paper, we focus on Broad-Phase algorithms in a large dynamic three-dimensional environment. We studied two kinds of Broad-Phase algorithms: spatial partitioning and spatial sorting. Spatial partitioning techniques operate by dividing space into a number of regions that can be quickly tested against each object. Two types of spatial partitioning are considered: grids and trees. The grid-based algorithms divide space into regions and test whether objects overlap the same region of space, which reduces the number of pairs to test. The tree-based algorithms use a tree structure in which each node spans a particular area of space. This reduces the cost of pairwise checking because only tree leaves are checked. The spatial sorting based algorithm relies on a sorted spatial ordering of objects. Axis-Aligned Bounding Boxes (AABBs) are projected onto the x, y and z axes and put into sorted lists. Two objects collide if and only if their projections collide on all three axes. Sorting along the axes reduces the number of pairwise tests to only those pairs that collide on at least one axis. For this study, ten different Broad-Phase collision detection algorithms or frameworks have been considered. The Bullet [6] and CGAL [10, 11] frameworks have been used. Most of the implemented algorithms come from papers or existing implementations.
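
To make the two families of Broad-Phase algorithms described above more concrete, here is a short Python sketch of a single-axis sort-and-sweep and of a uniform-grid partitioning, both checked against the naive all-pairs test. It is an illustrative simplification, not one of the ten algorithms evaluated in the paper: the box representation `(min_corner, max_corner)`, the choice of a single sweep axis, the `cell_size` parameter and all function names are assumptions made for this example.

```python
import math
from collections import defaultdict
from itertools import combinations, product


def aabb_overlap(a, b):
    """Two AABBs overlap iff their projections overlap on x, y and z."""
    return all(a[0][k] <= b[1][k] and b[0][k] <= a[1][k] for k in range(3))


def brute_force(boxes):
    """Naive reference: tests all n(n-1)/2 pairs."""
    return {(i, j) for i, j in combinations(range(len(boxes)), 2)
            if aabb_overlap(boxes[i], boxes[j])}


def sweep_and_prune(boxes, axis=0):
    """Spatial sorting: sort by lower bound on one axis, then sweep."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0][axis])
    active, pairs = [], set()
    for i in order:
        lower = boxes[i][0][axis]
        # Drop boxes whose interval on the sweep axis ended before `lower`.
        active = [j for j in active if boxes[j][1][axis] >= lower]
        for j in active:
            # Remaining boxes overlap box i on the sweep axis;
            # the full test checks the other two axes as well.
            if aabb_overlap(boxes[i], boxes[j]):
                pairs.add((min(i, j), max(i, j)))
        active.append(i)
    return pairs


def grid_broad_phase(boxes, cell_size):
    """Spatial partitioning: only boxes sharing a grid cell are tested."""
    cells = defaultdict(list)
    for i, (lo, hi) in enumerate(boxes):
        spans = [range(math.floor(lo[k] / cell_size),
                       math.floor(hi[k] / cell_size) + 1) for k in range(3)]
        for key in product(*spans):
            cells[key].append(i)
    pairs = set()
    for members in cells.values():
        for i, j in combinations(members, 2):
            if aabb_overlap(boxes[i], boxes[j]):
                pairs.add((min(i, j), max(i, j)))
    return pairs
```

All three functions return the same set of overlapping index pairs; they differ only in how many pairwise tests they perform. In practice the grid variant is sensitive to the ratio between `cell_size` and the object sizes, and the sweep variant to how well the chosen axis separates the objects.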

Full Text
Peer Reviewed
On the performance of an overlapping-domain parallelization strategy for Eulerian-Lagrangian Multiphysics software
Pozzetti, Gabriele UL; Besseron, Xavier UL; Rousset, Alban UL et al

in AIP Conference Proceedings ICNAAM 2017 (2017, September)

In this work, a strategy for the parallelization of a two-way CFD-DEM coupling is investigated. It consists in adopting balanced overlapping partitions for the CFD and the DEM domains, which aims to reduce the memory consumption and the inter-process communication between CFD and DEM. Two benchmarks are proposed to assess the consistency and scalability of this approach; coupled execution on 252 cores shows that less than 1% of the time is spent on inter-physics data exchange.
