It was recently shown that computing an optimal stochastic controller in a discounted
infinite-horizon partially observable Markov decision process is an NP-hard problem. The
reduction (from the independent-set problem) involves designing an MDP with special
state-action rewards. In this note, we show that the case of state-only-dependent rewards
is also NP-hard.
Research center :
Luxembourg Centre for Systems Biomedicine (LCSB): Machine Learning (Vlassis Group)
Disciplines :
Computer science
Identifiers :
UNILU:UL-CONFERENCE-2012-294
Author, co-author :
Vlassis, Nikos ; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Littman, M. L.
Barber, D.
External co-authors :
yes
Language :
English
Title :
Stochastic POMDP controllers: How easy to optimize?