Collaborative multiagent reinforcement learning by payoff propagation

Kok, Jelle R.; VLASSIS, Nikos

Article (Scientific journals)

2006 • In Journal of Machine Learning Research, 7, p. 1789-1828

Peer Reviewed verified by ORBi

Permalink
https://hdl.handle.net/10993/11036

Files

Full Text

download.pdf

Publisher postprint (521.27 kB)

http://jmlr.org/papers/volume7/kok06a/kok06a.pdf

Details

Keywords :

collaborative multiagent system; coordination graph; reinforcement learning; Q-learning; belief propagation

Abstract :

[en] In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of coordination graphs of Guestrin, Koller, and Parr (2002a), which exploits the dependencies between agents to decompose the global payoff function into a sum of local terms. First, we deal with the single-state case and describe a payoff propagation algorithm that computes the individual actions that approximately maximize the global payoff function. The method can be viewed as the decision-making analogue of belief propagation in Bayesian networks. Second, we focus on learning the behavior of the agents in sequential decision-making tasks. We introduce different model-free reinforcement-learning techniques, collectively called Sparse Cooperative Q-learning, which approximate the global action-value function based on the topology of a coordination graph, and perform updates using the contribution of the individual agents to the maximal global action value. The combined use of an edge-based decomposition of the action-value function and the payoff propagation algorithm for efficient action selection results in an approach that scales only linearly in the problem size. We provide experimental evidence that our method outperforms related multiagent reinforcement-learning methods based on temporal differences.
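The single-state case described in the abstract can be illustrated with a small message-passing sketch. This is not the authors' implementation: it assumes the global payoff is a sum of per-agent terms and per-edge terms on a coordination graph, and that every agent has the same number of actions. The function names (`payoff_propagation`, `global_payoff`), the fixed iteration count, and the mean-subtraction normalization are all choices made here for illustration. On a tree-structured graph with a unique best joint action, max-sum message passing of this kind recovers the maximizing joint action; on graphs with cycles it is only approximate.

```python
import itertools
import numpy as np


def global_payoff(actions, unary, pairwise):
    """Sum of per-agent payoffs and per-edge payoffs for one joint action."""
    total = sum(unary[i][a] for i, a in enumerate(actions))
    total += sum(f[actions[i], actions[j]] for (i, j), f in pairwise.items())
    return float(total)


def payoff_propagation(unary, pairwise, n_iters=10):
    """Max-sum message passing on a coordination graph (illustrative sketch).

    unary:    list of arrays, unary[i][a_i] is agent i's local payoff.
    pairwise: dict mapping an edge (i, j) to a table indexed [a_i, a_j].
    Returns one action index per agent.
    """
    n_agents = len(unary)
    n_actions = len(unary[0])
    neighbors = {i: [] for i in range(n_agents)}
    for i, j in pairwise:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def edge_table(i, j):
        # Payoff table for edge {i, j}, always indexed [a_i, a_j].
        return pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T

    # mu[(i, j)][a_j] is the message agent i sends to neighbor j.
    mu = {(i, j): np.zeros(n_actions) for i in neighbors for j in neighbors[i]}
    for _ in range(n_iters):
        new_mu = {}
        for i, j in mu:
            # Messages arriving at i from every neighbor except j.
            incoming = sum((mu[(k, i)] for k in neighbors[i] if k != j),
                           np.zeros(n_actions))
            scores = (unary[i] + incoming)[:, None] + edge_table(i, j)
            msg = scores.max(axis=0)           # maximize over a_i
            new_mu[(i, j)] = msg - msg.mean()  # keep messages bounded
        mu = new_mu

    # Each agent picks its action from its local payoff plus incoming messages.
    actions = []
    for i in range(n_agents):
        g = unary[i] + sum((mu[(j, i)] for j in neighbors[i]),
                           np.zeros(n_actions))
        actions.append(int(np.argmax(g)))
    return actions


# Hypothetical example: 4 agents in a chain, 3 actions each, random payoffs.
rng = np.random.default_rng(0)
unary = [rng.normal(size=3) for _ in range(4)]
pairwise = {(i, i + 1): rng.normal(size=(3, 3)) for i in range(3)}

actions = payoff_propagation(unary, pairwise)
best = max(global_payoff(a, unary, pairwise)
           for a in itertools.product(range(3), repeat=4))
print(actions, global_payoff(actions, unary, pairwise), best)
```

Each agent only exchanges messages with its graph neighbors, so the work per iteration grows with the number of edges rather than with the size of the joint action space; the brute-force check at the end is feasible only because the example is tiny.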

Disciplines :

Computer science

Identifiers :

UNILU:UL-ARTICLE-2011-715

Author, co-author :

Kok, Jelle R.

VLASSIS, Nikos ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)

Language :

English

Title :

Collaborative multiagent reinforcement learning by payoff propagation

Publication date :

2006

Journal title :

Journal of Machine Learning Research

ISSN :

1532-4435

eISSN :

1533-7928

Publisher :

MIT Press, United States - Massachusetts

Volume :

7

Pages :

1789-1828

Peer reviewed :

Peer Reviewed verified by ORBi

Additional URL :

http://jmlr.org/papers/volume7/kok06a/kok06a.pdf

Available on ORBilu :

since 17 November 2013

Statistics

Number of views

136 (0 by Unilu)

Number of downloads

147 (0 by Unilu)

Scopus® citations

285

Scopus® citations without self-citations

280

WoS™ citations

209