Computational simulations of conformational sampling in general, and of macromolecular folding in particular, represent one of the most important and yet most challenging applications of computer science in biology and medicinal chemistry. The advent of GRID computing may trigger major progress in this field. This paper presents our first attempts to design GRID-based conformational sampling strategies that explore the extremely rugged energy response surface as a function of molecular geometry, searching for low-energy zones in phase spaces of hundreds of degrees of freedom. We have generalized the classical island model deployment of Genetic Algorithms (GA) to a "planetary" model in which each node of the grid is treated as a "planet" harboring quasi-independent multi-island simulations based on a hybrid GA-driven sampling approach. Although different "planets" do not communicate with each other, thus minimizing inter-CPU exchanges on the GRID, each new simulation benefits from the preliminary knowledge extracted from the centralized pool of already visited geometries, which is located on the dispatcher machine and disseminated to every new "planet". This "panspermic" strategy allows new simulations either to be attracted towards an apparently promising phase space zone (biasing strategies, intensification procedures) or to avoid areas that have already been sampled in depth (tabu areas). Successful folding of mini-proteins typically used as benchmarks for all-atom protein simulations has been observed, although the reproducibility of these highly stochastic simulations in huge problem spaces still needs improvement. Work on two structured peptides (the "tryptophan cage" 1L2Y and the "tryptophan zipper" 1LE1) used as benchmarks for all-atom protein folding simulations has shown that the planetary model is able to reproducibly sample conformers from the neighborhood of the native geometries.
However, within these neighborhoods (within ensembles of conformers similar to models published on the basis of experimental geometry determinations), the energy landscapes are still extremely rugged. Simulations therefore generally produce "correct" geometries (similar enough to the experimental model for any practical purpose) which unfortunately sometimes correspond to relatively high energy levels and are therefore less stable than the most stable misfolded conformers. The method thus reproducibly visits the native phase space zone, but fails to reproducibly hit the bottom of its rugged energy well. Intensification of local sampling may in principle solve this problem, but it is limited by computational resources. Finding the optimal time point at which a phase space zone should stop being intensively searched and be declared tabu is a very difficult problem that still awaits a practically useful solution.
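The planetary scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: a toy Rastrigin-like function stands in for the molecular force field, geometries are plain lists of coordinates, and all names (`run_planet`, `dispatch`, `penalized`) and parameter values are hypothetical. The rule that alternates between intensification and tabu avoidance from one planet to the next is an arbitrary assumption made only to show both uses of the central pool.

```python
import math
import random

def energy(x):
    """Toy rugged landscape (Rastrigin-like); a stand-in for a force field."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)

def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def penalized(x, tabu, radius, penalty):
    """Energy plus a fixed penalty for each tabu zone that contains x."""
    return energy(x) + sum(penalty for t in tabu if distance(x, t) < radius)

def run_planet(seed, tabu, attractors, dim=4, islands=3, pop=20, gens=40,
               radius=0.5, penalty=50.0):
    """One planet: a multi-island GA that never talks to other planets."""
    rng = random.Random(seed)

    def fit(x):
        return penalized(x, tabu, radius, penalty)

    def new_individual():
        # Biasing: about half of the individuals start near a promising geometry.
        if attractors and rng.random() < 0.5:
            return [b + rng.gauss(0.0, 0.3) for b in rng.choice(attractors)]
        return [rng.uniform(-5.0, 5.0) for _ in range(dim)]

    populations = [[new_individual() for _ in range(pop)] for _ in range(islands)]
    for g in range(gens):
        for i, p in enumerate(populations):
            p.sort(key=fit)
            parents = p[: pop // 2]              # truncation selection (elitist)
            children = []
            while len(parents) + len(children) < pop:
                a, b = rng.sample(parents, 2)
                child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
                child[rng.randrange(dim)] += rng.gauss(0.0, 0.3)  # point mutation
                children.append(child)
            populations[i] = parents + children
        if g % 10 == 9:
            # Ring migration between the islands of this planet only.
            bests = [min(p, key=fit) for p in populations]
            for i, p in enumerate(populations):
                p[-1] = list(bests[i - 1])
    # Report raw (unpenalized) energies back to the dispatcher.
    return [(energy(x), x) for p in populations for x in p]

def dispatch(n_planets=4, keep=3):
    """Dispatcher: owns the central pool and seeds every new planet from it."""
    pool = []
    for seed in range(n_planets):
        best = [x for _, x in sorted(pool, key=lambda e: e[0])[:keep]]
        if seed % 2 == 1:
            visited = run_planet(seed, tabu=[], attractors=best)  # intensify
        else:
            visited = run_planet(seed, tabu=best, attractors=[])  # avoid (tabu)
        pool.extend(visited)
    return pool

if __name__ == "__main__":
    pool = dispatch()
    print("pool size:", len(pool))
    print("lowest energy found:", min(e for e, _ in pool))
```

Planets are run sequentially here for simplicity; since they exchange nothing with one another and only receive a snapshot of the pool at start-up, the same structure would map onto independent grid jobs. Deciding when a zone should switch from attractor to tabu is exactly the open question raised above, and the sketch does not attempt to answer it.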