Decentralized partially observable Markov decision processes (DEC-POMDPs) form a general framework for planning for groups of cooperating agents that inhabit a stochastic and partially observable environment. Unfortunately, computing optimal plans in a DEC-POMDP has been shown to be intractable (NEXP-complete), and approximate algorithms for specific subclasses have been proposed. Many of these algorithms rely on an (approximate) solution of the centralized planning problem (i.e., treating the whole team as a single agent). We take a more decentralized approach, in which each agent only reasons over its own local state and some uncontrollable state features, which are shared by all team members. In contrast to other approaches, we model communication as an integral part of the agent's reasoning, in which the meaning of a message is directly encoded in the policy of the communicating agent. We explore iterative methods for approximately solving such models, and we conclude with some encouraging preliminary experimental results.
Disciplines :
Computer science
Identifiers :
UNILU:UL-ARTICLE-2011-713
Author, co-author :
Spaan, Matthijs T. J.
Gordon, Geoffrey J.
Vlassis, Nikos; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Language :
English
Title :
Decentralized planning under uncertainty for teams of communicating agents
Publication date :
2006
Event name :
Int. Joint Conf. on Autonomous Agents and Multiagent Systems, Hakodate, Japan
Event date :
2006
Main work title :
Proc. Int. Joint Conf. on Autonomous Agents and Multiagent Systems, Hakodate, Japan