Reference : State Aggregation for Multiagent Communication over Rate-Limited Channels
Scientific journals : Article
Engineering, computing & technology : Electrical & electronics engineering
http://hdl.handle.net/10993/45575
State Aggregation for Multiagent Communication over Rate-Limited Channels
English
Mostaani, Arsham [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom]
Vu, Thang Xuan [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom]
Chatzinotas, Symeon [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SigCom]
Ottersten, Björn [University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
Dec-2020
IEEE Global Communications Conference
Yes
International
[en] Task-based information compression ; machine learning for communications ; reinforcement learning
[en] A collaborative task is assigned to a multiagent system (MAS) in which agents are allowed to communicate. The MAS runs over an underlying Markov decision process, and its task is to maximize the averaged sum of discounted one-stage rewards. Although knowing the global state of the environment is necessary for optimal action selection by the MAS, agents are limited to individual observations. Inter-agent communication can address this local observability; however, its limited rate prevents the agents from acquiring precise global state information. To overcome this challenge, agents need to communicate their observations compactly, so that the MAS gives up as little of the sum of rewards as possible. We show that this problem is equivalent to a form of rate-distortion problem, which we call task-based information compression. State Aggregation for Information Compression (SAIC) is introduced here to perform the task-based information compression. SAIC is shown, under certain conditions, to be capable of achieving the optimal performance in terms of the attained sum of discounted rewards. The proposed algorithm is applied to a rendezvous problem, and its performance is compared with two benchmarks: (i) conventional source coding algorithms and (ii) centralized multiagent control using reinforcement learning. Numerical experiments confirm the superiority and fast convergence of the proposed SAIC.
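The idea of compressing observations by aggregating states can be illustrated with a minimal sketch. It is not the authors' SAIC algorithm: it assumes, purely for illustration, that each observation already has a scalar value estimate and that observations with similar values may share one message, grouped here with a simple 1-D k-means. The names `aggregate_states`, `values`, and `rate_bits` are hypothetical.

```python
def aggregate_states(values, rate_bits, iters=50):
    """Group observations into at most 2**rate_bits clusters by value.

    Illustrative assumption: `values[i]` is a scalar value estimate for
    observation i, and a 1-D k-means over these values decides which
    observations share a message. The abstract does not confirm this choice.
    """
    values = [float(v) for v in values]
    ordered = sorted(set(values))
    # A channel carrying rate_bits bits can distinguish 2**rate_bits messages.
    k = min(2 ** rate_bits, len(ordered))

    # Initialise centroids at evenly spaced positions among the distinct values.
    if k == 1:
        centroids = [ordered[0]]
    else:
        centroids = [ordered[round(i * (len(ordered) - 1) / (k - 1))]
                     for i in range(k)]

    def assign(cs):
        # Index of the nearest centroid for every observation.
        return [min(range(len(cs)), key=lambda j: abs(v - cs[j]))
                for v in values]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated

    return assign(centroids), centroids


# Six observations, a 1-bit channel: each agent would send only its cluster index.
labels, centroids = aggregate_states([0.0, 0.1, 0.2, 9.8, 9.9, 10.0], rate_bits=1)
```

In this toy setting the cluster index plays the role of the compressed message: observations whose values are close are treated as interchangeable for the task, which is the sense in which the compression is task-based rather than aimed at reconstructing the observation itself.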
H2020 ; 742648 - AGNOSTIC - Actively Enhanced Cognition based Framework for Design of Complex Systems

File(s) associated to this reference

Fulltext file(s):

File: a172-mostaani.pdf
Version: Author preprint
Size: 924.71 kB
Access: Open access


All documents in ORBilu are protected by a user license.