[en] Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provably intractable. We demonstrate how their scalability can be improved by exploiting locality of interaction between agents in a factored representation. Factored Dec-POMDP representations have been proposed before, but only for Dec-POMDPs whose transition and observation models are fully independent. Such strong assumptions simplify the planning problem, but result in models with limited applicability. By contrast, we consider general factored Dec-POMDPs for which we analyze the model dependencies over space (locality of interaction) and time (horizon of the problem). We also present a formulation of decomposable value functions. Together, our results allow us to exploit the problem structure as well as heuristics in a single framework that is based on collaborative graphical Bayesian games (CGBGs). A preliminary experiment shows a speedup of two orders of magnitude.
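The decomposable value functions described in the abstract can be maximized efficiently by exploiting locality of interaction, in the spirit of the coordination-graph methods of Guestrin et al. and Kok & Vlassis cited below. A minimal sketch, with made-up payoff tables for a hypothetical three-agent chain where the joint value factors as Q(a1, a2, a3) = Q1(a1, a2) + Q2(a2, a3):

```python
from itertools import product

# Illustrative local payoff tables (values invented for this sketch);
# each agent chooses an action in {0, 1}.
Q1 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 0.0, (1, 1): 3.0}
Q2 = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 0.5}
ACTIONS = (0, 1)

def brute_force():
    """Enumerate all 2^3 joint actions: exponential in the number of agents."""
    return max((Q1[a1, a2] + Q2[a2, a3], (a1, a2, a3))
               for a1, a2, a3 in product(ACTIONS, repeat=3))

def variable_elimination():
    """Exploit the chain structure: work is linear in the number of agents."""
    # Eliminate agent 1: its best payoff contribution for each a2.
    f = {a2: max(Q1[a1, a2] for a1 in ACTIONS) for a2 in ACTIONS}
    # Eliminate agent 3 symmetrically.
    g = {a2: max(Q2[a2, a3] for a3 in ACTIONS) for a2 in ACTIONS}
    # Agent 2 maximizes the sum of the induced local functions.
    best_a2 = max(ACTIONS, key=lambda a2: f[a2] + g[a2])
    # Recover the eliminated agents' best responses.
    best_a1 = max(ACTIONS, key=lambda a1: Q1[a1, best_a2])
    best_a3 = max(ACTIONS, key=lambda a3: Q2[best_a2, a3])
    value = Q1[best_a1, best_a2] + Q2[best_a2, best_a3]
    return value, (best_a1, best_a2, best_a3)
```

Both routines find the same optimal joint action; the elimination order here follows the chain, and in general the cost of this step is exponential only in the induced width of the interaction graph, not in the total number of agents.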
Disciplines:
Computer science
Identifiers:
UNILU:UL-ARTICLE-2011-707
Author, co-author:
Oliehoek, Frans A.
Spaan, Matthijs T. J.
VLASSIS, Nikos ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Whiteson, Shimon
Document language:
English
Title:
Exploiting locality of interaction in factored Dec-POMDPs
Publication date:
2008
Event name:
Int. Joint Conf. on Autonomous Agents and Multi-Agent Systems
Event date:
2008
Title of the main work:
Int. Joint Conf. on Autonomous Agents and Multi-Agent Systems
R. Becker, S. Zilberstein, V. Lesser, and C. V. Goldman. Solving transition independent decentralized Markov decision processes. Journal of Artificial Intelligence Research (JAIR), 22:423-455, December 2004.
D. S. Bernstein, R. Givan, N. Immerman, and S. Zilberstein. The complexity of decentralized control of Markov decision processes. Math. Oper. Res., 27(4):819-840, 2002.
C. Boutilier, T. Dean, and S. Hanks. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11:1-94, 1999.
C. Boutilier, R. Dearden, and M. Goldszmidt. Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2):49-107, 2000.
X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In Proc. of Uncertainty in Artificial Intelligence, pages 33-42. San Francisco: Morgan Kaufmann, 1998.
C. Guestrin, D. Koller, and R. Parr. Multiagent planning with factored MDPs. In Advances in Neural Information Processing Systems 14, pages 1523-1530, 2002.
E. A. Hansen, D. S. Bernstein, and S. Zilberstein. Dynamic programming for partially observable stochastic games. In AAAI '04: Proceedings of the Nineteenth National Conference on Artificial Intelligence, pages 709-715, 2004.
E. A. Hansen and Z. Feng. Dynamic programming for POMDPs using a factored state representation. In Proceedings of the Fifth International Conference on Artificial Intelligence Planning Systems, pages 130-139, Apr. 2000.
J. R. Kok and N. Vlassis. Collaborative multiagent reinforcement learning by payoff propagation. Journal of Machine Learning Research, 7:1789-1828, 2006.
D. Koller and R. Parr. Computing factored value functions for policies in structured MDPs. In Proc. Sixteenth International Joint Conference on Artificial Intelligence (IJCAI), pages 1332-1339, 1999.
R. Nair, P. Varakantham, M. Tambe, and M. Yokoo. Networked distributed POMDPs: A synthesis of distributed constraint optimization and POMDPs. In Proc. of the National Conference on Artificial Intelligence, pages 133-139, 2005.
F. A. Oliehoek, M. T. J. Spaan, and N. Vlassis. Optimal and approximate Q-value functions for decentralized POMDPs. Journal of Artificial Intelligence Research, 2008. (to appear).
F. A. Oliehoek and N. Vlassis. Q-value functions for decentralized POMDPs. In Proc. of Int. Joint Conference on Autonomous Agents and Multi Agent Systems, pages 833-840, May 2007.
Z. Rabinovich, C. V. Goldman, and J. S. Rosenschein. The complexity of multiagent systems: the price of silence. In Proc. of Int. Joint Conference on Autonomous Agents and Multi Agent Systems, pages 1102-1103, 2003.
M. Roth, R. Simmons, and M. Veloso. Exploiting factored representations for decentralized execution in multi-agent teams. In Proc. of Int. Joint Conference on Autonomous Agents and Multi Agent Systems, pages 467-463, May 2007.
S. Singh, V. Soni, and M. Wellman. Computing approximate Bayes-Nash equilibria in tree-games of incomplete information. In EC '04: Proceedings of the 5th ACM conference on Electronic commerce, pages 81-90, 2004.
V. Soni, S. Singh, and M. Wellman. Constraint satisfaction algorithms for graphical games. In Proc. of Int. Joint Conference on Autonomous Agents and Multi Agent Systems, pages 423-430, 2007.
D. Szer, F. Charpillet, and S. Zilberstein. MAA*: A heuristic search algorithm for solving decentralized POMDPs. In Proc. of Uncertainty in Artificial Intelligence, 2005.
P. Varakantham, J. Marecki, Y. Yabu, M. Tambe, and M. Yokoo. Letting loose a SPIDER on a network of POMDPs: Generating quality guaranteed policies. In Proc. of Int. Joint Conference on Autonomous Agents and Multi Agent Systems, 2007.
N. Vlassis. A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2007.