2015 • In Proc. of the 18th Intl. Workshop on Nature Inspired Distributed Computing (NIDISC 2015), part of the 29th IEEE/ACM Intl. Parallel and Distributed Processing Symposium (IPDPS 2015)
[en] Distributed parallel computing platforms contribute for a large part to some of the most powerful computers. Such architec- tures are typically based on accelerators (General Purpose com- puting on Graphics Processing Units, Many Integrated Cores e.g Xeon Phi co-processors) and/or a large number of interconnected computing nodes. Obviously, they raise new challenges, typically in terms of scalability, robustness, adaptability and security. At the advent of the quest for Ultrascale Computing Systems, this paper addresses the issue of fault tolerance toward Byzantine failures overs such platforms. Indeed, the inherent unpredictable nature of these errors render their detection, not speaking of their correction, hard or even impossible to perform at large-scale. At this level, Algorithm-Based Fault Tolerance (ABFT) techniques where the fault tolerance scheme is tailored to the algorithm performed, seems the most promising approaches to deal with such failures. In this context, Evolutionary Algorithms (EAs), especially panmictic global parallel EAs, exhibit a remarkable resilience against byzantine failures modeled as cheating faults as demonstrated either empirically or theoretically in previous studies [1], [2]. In this paper, we extend this analysis to the case of distributed EAs based on the cellular model leading to distributed Cellular Evolutionary Algorithms (dCEAs). Our empirical study over a set or reference optimization problem confirm the ABFT nature of dCEAs. To our knowledge, this is the first study of dCEAs under the perspective of cheating issues and crash faults in a domain of distributed computations, thus opening new insights and perspectives for the design of competitive ultra-scale system based on evolutionary programming models.
Research center :
ULHPC - University of Luxembourg: High Performance Computing
Disciplines :
Computer science
Author, co-author :
Muszynski, Jakub ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
Varrette, Sébastien ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
Dorronsorro, Bernabé
Bouvry, Pascal ; University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Computer Science and Communications Research Unit (CSC)
External co-authors :
yes
Language :
English
Title :
Distributed Cellular Evolutionary Algorithms in a Byzantine Environment
Publication date :
May 2015
Event name :
Proc. of the 18th Intl. Workshop on Nature Inspired Distributed Computing (NIDISC 2015), part of the 29th IEEE/ACM Intl. Parallel and Distributed Processing Symposium (IPDPS 2015),
Event date :
May 2015
Audience :
International
Main work title :
Proc. of the 18th Intl. Workshop on Nature Inspired Distributed Computing (NIDISC 2015), part of the 29th IEEE/ACM Intl. Parallel and Distributed Processing Symposium (IPDPS 2015)
F. F. de Vega, "A Fault Tolerant Optimization Algorithm based on Evolutionary Computation," in Proceedings of the International Conference on Dependability of Computer Systems (DEPCOS-RELCOMEX'06). Washington, DC, USA: IEEE Computer Society, 2006, pp. 335-342.
J. Muszyński, S. Varrette, P. Bouvry, F. Seredyński, and S. U. Khan, "Convergence Analysis of Evolutionary Algorithms in the Presence of Crash-Faults and Cheaters," Intl. Journal. of Computers and Mathematics with Applications (CAMWA), vol. 64, no. 12, pp. 3805-3819, Dec 2012.
A. S. Tanenbaum and M. van Steen, Distributed Systems: Principles and Paradigms, 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2006.
D. P. Anderson, J. Cobb, E. Korpela, M. Lebofsky, and D. Werthimer, "SETI@home: an experiment in public-resource computing," Communications of the ACM, vol. 45, no. 11, pp. 56-61, Nov. 2002.
Folding@home, http://folding.stanford.edu.
D. P. Anderson, "BOINC: A System for Public-Resource Computing and Storage," in Fifth IEEE/ACM International Workshop on Grid Computing. IEEE, 2004, pp. 4-10.
E. Alba and B. Dorronsoro, Cellular Genetic Algorithms, ser. Operations Research/Computer Science Interfaces. Springer-Verlag Heidelberg, 2008, vol. 42.
D. Whitley, "Cellular genetic algorithms," in Fifth Int. Conf. on Genetic Algorithms ICGA-5, S. Forrest, Ed. California, CA, USA: Morgan-Kaufmann, 1993, p. 658.
B. Manderick and P. Spiessens, "Fine-grained parallel genetic algorithm," in Third Int. Conf. on Genetic Algorithms ICGA-3, J. Schaffer, Ed. Morgan-Kaufmann, 1989, pp. 428-433.
E. Alba and M. Tomassini, "Parallelism and evolutionary algorithms," Evolutionary Computation, IEEE Transactions on, vol. 6, no. 5, pp. 443-462, Oct 2002.
E. Alba, B. Dorronsoro, M. Giacobini, and M. Tomassini, Handbook of Bioinspired Algorithms and Applications. CRC Press, 2006, ch. Decentralized Cellular Evolutionary Algorithms, pp. 103-120.
E. Alba and B. Dorronsoro, Cellular Genetic Algorithms, 1st ed. Springer Publishing Company, Incorporated, 2008.
J. I. Hidalgo, J. Lanchares, F. F. de Vega, and D. Lombrana, "Is the island model fault tolerant?" in GECCO '07: Proceedings of the 2007 GECCO conference companion on Genetic and evolutionary computation. London, United Kingdom: ACM, July 7-11 2007, pp. 2737-2744.
A. Morales-Reyes, E. F. Stefatos, A. T. Erdogan, and T. Arslan, "Towards Fault-Tolerant Systems based on Adaptive Cellular Genetic Algorithms," in Proceedings of the 2008 NASA/ESA Conference on Adaptive Hardware and Systems (AHS'08). Noordwijk, The Netherlands: IEEE Computer Society, June 22-25 2008, pp. 398-405.
D. L. González, F. F. de Vega, and H. Casanova, "Characterizing fault tolerance in genetic programming," in Proc. of the 2009 workshop on Bio-inspired algorithms for distributed systems (BADS'09). New York, NY, USA: ACM, 2009, pp. 1-10.
J. L. Laredo, P. Bouvry, D. González, F. F. de Vega, M. G. Arenas, J. J. Merelo, and C. Fernandes, "Designing robust volunteer-based evolutionary algorithms," Genetic Programming and Evolvable Machines, vol. 15, no. 3, pp. 221-244, 2014.
E. Montero and M.-C. Riff, "On-the-fly calibrating strategies for evolutionary algorithms," Information Sciences, vol. 181, no. 3, pp. 552-566, 2011.
S. Varrette, P. Bouvry, H. Cartiaux, and F. Georgatos, "Management of an Academic HPC Cluster: The UL Experience," in Proc. of the 2014 Intl. Conf. on High Performance Computing & Simulation (HPCS 2014). Bologna, Italy: IEEE, July 2014, pp. 959-967.
J. Schaffer and L. Eshelman, "On crossover as an evolutionary viable strategy," R. Belew and L. Booker, Eds.
D. Goldberg, K. Deb, and J. Horn, "Massively multimodality, deception and genetic algorithms," R. Männer and B. Manderick, Eds. North-Holland, 1992, pp. 37-46.
M. Garey and D. Johnson, Computers and Intractability: a Guide to the Theory of NP-completeness. San Franciso, California: Freeman, 1979.
D. Goldberg, K. Deb, H. Kargupta, and G. Harik, "Rapid, accurate optimization of difficult problems using fast messy genetic algorithms," S. Forrest, Ed.
D. Sudholt, "General lower bounds for the running time of evolutionary algorithms," in Parallel Problem Solving from Nature, PPSN XI, ser. Lecture Notes in Computer Science, R. Schaefer, C. Cotta, J. Kołodziej, and G. Rudolph, Eds. Springer Berlin Heidelberg, 2010, vol. 6238, pp. 124-133.