Reference : Model-free reinforcement learning as mixture learning
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
http://hdl.handle.net/10993/3376
Model-free reinforcement learning as mixture learning
English
Vlassis, Nikos [University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)]
Toussaint, M.
2009
Proceedings of the 26th International Conference on Machine Learning
pp. 1081-1088
Peer reviewed : Yes
26th International Conference on Machine Learning
2009
Montreal, Canada
[en] Reinforcement Learning ; Mixture Learning ; Optimal Control ; EM algorithm
[en] We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the infinite and the finite horizon case. We describe a Stochastic Approximation EM algorithm for likelihood maximization that, in the tabular case, is equivalent to a non-bootstrapping optimistic policy iteration algorithm such as Sarsa(1) and can be applied in both MDPs and POMDPs. On the theoretical side, relating the proposed stochastic EM algorithm to the family of optimistic policy iteration algorithms provides new tools for the design and analysis of algorithms in that family. On the practical side, preliminary experiments on a POMDP problem demonstrate encouraging results.
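For orientation (an editorial sketch, not the authors' algorithm): in the mixture formulation this line of work builds on, the expected discounted return of a policy is proportional to the likelihood of a mixture over finite-horizon models with geometric weights (1 - gamma) * gamma^T, so return maximization becomes likelihood maximization. The tabular, Sarsa(1)-style non-bootstrapping optimistic policy iteration the abstract refers to can be sketched in Python as follows; the environment interface (reset() and step() returning (next_state, reward, done)) and all identifiers are illustrative assumptions.

    import random
    from collections import defaultdict

    def sarsa_one(env, n_actions, episodes=1000, epsilon=0.1, gamma=0.95):
        """Tabular Sarsa(1): non-bootstrapping optimistic policy iteration.

        Q-values are updated toward complete Monte-Carlo returns (no
        bootstrapping from current estimates), and the epsilon-greedy
        policy is improved after every episode.
        """
        Q = defaultdict(float)     # Q[(state, action)] -> return estimate
        counts = defaultdict(int)  # visit counts for Robbins-Monro steps

        for _ in range(episodes):
            # Roll out one episode under the current epsilon-greedy policy.
            s, done, trajectory = env.reset(), False, []
            while not done:
                if random.random() < epsilon:
                    a = random.randrange(n_actions)
                else:
                    a = max(range(n_actions), key=lambda a_: Q[(s, a_)])
                s_next, r, done = env.step(a)  # assumed interface
                trajectory.append((s, a, r))
                s = s_next

            # Every-visit Monte-Carlo update with the full return G.
            G = 0.0
            for s, a, r in reversed(trajectory):
                G = r + gamma * G
                counts[(s, a)] += 1
                alpha = 1.0 / counts[(s, a)]          # decaying step size
                Q[(s, a)] += alpha * (G - Q[(s, a)])  # stochastic approximation
        return Q

Because the updates use full returns rather than bootstrapped targets, the same loop applies unchanged to a POMDP when s is taken to be the observation or observable history, which is the applicability the abstract notes.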

File(s) associated to this reference

Fulltext file: 09-vlassis-toussaint-ICML.pdf (author postprint, 276.05 kB, open access)

