Abstract :
We describe Utile Coordination, an algorithm that allows a multiagent system to learn where and how to coordinate. The method starts with uncoordinated learners and maintains statistics on expected returns. Coordination dependencies are dynamically added if the statistics indicate a statistically significant benefit. This results in a compact state representation because only necessary coordination is modeled. We apply our method within the framework of coordination graphs, in which value rules represent the coordination dependencies between the agents for a specific context. The algorithm is first applied on a small illustrative problem, and next on a large predator-prey problem in which two predators have to capture a single prey.
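The core idea of the abstract — maintaining per-state return statistics and promoting a state to "coordinated" only when a significance test shows a benefit — can be illustrated with a minimal sketch. This is not the paper's implementation: the class name, the use of Welch's t-statistic with a fixed threshold, and the parameters `t_threshold` and `min_samples` are all illustrative assumptions.

```python
import math
from collections import defaultdict

def welch_t(xs, ys):
    """Welch's t statistic for two independent samples (assumed test)."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

class UtileCoordinationSketch:
    """Illustrative sketch: collect returns per (state, joint action) and
    mark a state as needing coordination when the best joint action's
    returns are significantly higher than the returns of the others."""

    def __init__(self, t_threshold=2.0, min_samples=10):
        self.t_threshold = t_threshold      # significance cutoff (assumption)
        self.min_samples = min_samples      # samples required before testing
        # state -> joint action -> list of observed returns
        self.samples = defaultdict(lambda: defaultdict(list))
        self.coordinated = set()

    def record(self, state, joint_action, ret):
        self.samples[state][joint_action].append(ret)

    def check_state(self, state):
        per_action = self.samples[state]
        if len(per_action) < 2:
            return False
        # Returns of the empirically best joint action vs. all the rest.
        best = max(per_action.values(), key=lambda r: sum(r) / len(r))
        rest = [r for rs in per_action.values() if rs is not best for r in rs]
        if len(best) < self.min_samples or len(rest) < self.min_samples:
            return False
        if welch_t(best, rest) > self.t_threshold:
            self.coordinated.add(state)   # add a coordination dependency here
        return state in self.coordinated
```

In the actual algorithm such a test would drive the addition of value rules to the coordination graph for the flagged context; the sketch only shows the statistical trigger.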
Disciplines :
Computer science
Identifiers :
UNILU:UL-ARTICLE-2011-725
Author, co-author :
Kok, Jelle R.
't Hoen, Pieter Jan
Bakker, Bram
Vlassis, Nikos ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Language :
English
Title :
Utile coordination: Learning interdependencies among cooperative agents
Publication date :
2005
Event name :
IEEE Symp. on Computational Intelligence and Games
Event date :
2005
Main work title :
IEEE Symp. on Computational Intelligence and Games, Colchester, Essex