Tolérance aux fautes et reconfiguration dynamique pour les applications distribuées à grande échelle

BESSERON, Xavier

Doctoral thesis (Dissertations and theses)

2010

Permalink
https://hdl.handle.net/10993/39961

Files

Full Text

these.pdf

Author postprint (1.81 MB)

Annexes

soutenance.pdf

(1.45 MB)

All documents in ORBilu are protected by a user license.

Details

Keywords :

Parallel computing; Grid computing; Dynamic adaptation and reconfiguration; Fault tolerance; Data flow graph

Abstract :

[en] This work deals with high-performance computing on large-scale platforms such as computing grids. Computing grids are characterized by (1) frequent changes in the execution context and, especially, (2) a high failure probability caused by the large number of components. Running an application efficiently in such an environment requires taking both factors into account. Our research is based on the abstract representation of the application as a data flow graph, taken from the parallel and distributed programming model Athapascan/Kaapi. This abstract representation is used to provide solutions for (1) dynamic reconfiguration and (2) fault tolerance.

- First, we propose a dynamic reconfiguration mechanism that manages, transparently for the reconfiguration programmer, concurrent operations on the application state and the mutual consistency of states during distributed reconfiguration.
- Second, we present an original fault tolerance protocol that allows partial rollback of the application in case of failure. For this purpose, the set of computation tasks strictly required for recovery is computed.

These contributions are evaluated using the Kaapi and X-Kaapi software on the Grid'5000 computing platform.

Disciplines :

Computer science

Author, co-author :

BESSERON, Xavier ; Université de Grenoble > Laboratoire d'Informatique de Grenoble > MOAIS project

Language :

French

Title :

Tolérance aux fautes et reconfiguration dynamique pour les applications distribuées à grande échelle

Alternative titles :

[en] Fault tolerance and dynamic reconfiguration for large scale distributed applications

Defense date :

28 April 2010

Number of pages :

223

Institution :

UJF - Université Joseph Fourier - Grenoble, Grenoble, France

Degree :

PhD in Computer Science

Promotor :

Trystram, Denis

Gautier, Thierry

President :

Mossière, Jacques

Jury member :

Cappello, Franck

Cérin, Christophe

Desprez, Frédéric

Additional URL :

https://tel.archives-ouvertes.fr/tel-00486939

Available on ORBilu :

since 23 July 2019
